From paul at tupac.bio Thu Apr 4 17:46:12 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Fri, 5 Apr 2019 07:46:12 +0900
Subject: [maker-devel] Running SNAP with MAKER
Message-ID:

Dear MAKER Team,

I am running MAKER 2.31.10 on a 32-core instance. My first pass completed successfully. However, my second pass, which uses the trained SNAP and Augustus ab initio gene predictors, failed. Here is some example output that illustrates the problem:

MAKER WARNING: Changes in control files make re-use of all old data impossible
All old files will be erased before continuing
processing all repeats
doing repeat masking
doing repeat masking
#---------------------------------------------------------------------
Now starting the contig!!
SeqID: scf7180000008677_pilon_pilon
Length: 49996
#---------------------------------------------------------------------

doing repeat masking
preparing ab-inits
running snap.
#--------- command -------------#
Widget::snap:
/usr/bin/snap /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_masked.0 > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_masked.0.genome%2Ehmm.snap
#-------------------------------#
setting up GFF3 output and fasta chunks
processing all repeats
doing repeat masking
in cluster::shadow_cluster...
...finished clustering.
error: unknown command "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap help'
ERROR: Snap failed
--> rank=21, hostname=localhost
ERROR: Failed while preparing ab-inits
ERROR: Chunk failed at level:0, tier_type:2
FAILED CONTIG:scf7180000007575_pilon_pilon

I confirmed that the path to genome.hmm is correct.
In addition, run.log contains the following kind of output:

STARTED genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_masked.0.genome%2Ehmm.snap
DIED RANK 30:4:0:0
DIED COUNT 2
DIED RANK 30
DIED COUNT 2

How can I resolve this issue?

Also, is the warning that the old data cannot be reused to be expected?

Attached files:
- maker_opts1.ctl: first pass control file
- maker_opts2.ctl: second pass control file
- run.log: log file for an example contig

Thanks in Advance,

Paul Sheridan

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts2.ctl
Type: application/octet-stream
Size: 4727 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: run.log
Type: application/octet-stream
Size: 2366 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: maker_opts1.ctl
Type: application/octet-stream
Size: 4514 bytes
Desc: not available
URL:

From carsonhh at gmail.com Sat Apr 6 16:00:14 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sat, 6 Apr 2019 15:00:14 -0600
Subject: [maker-devel] Running SNAP with MAKER
In-Reply-To:
References:
Message-ID: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com>

The error is being thrown by snap itself. Perhaps there is an issue with the genome.hmm file. Did you generate the file immediately before this run? Perhaps you can redo that process and review any errors that come up during training.
Some details on training SNAP from the wiki -->
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors

--Carson

From paul at tupac.bio Sun Apr 7 04:27:53 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Sun, 7 Apr 2019 18:27:53 +0900
Subject: [maker-devel] Running SNAP with MAKER
In-Reply-To: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com>
References: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com>
Message-ID:

Hi Carson,

Indeed, I generated the hmm file immediately before my second run. I redid the process using these commands from the link you supplied:

    mkdir snap
    cd snap
    gff3_merge -d /root/tuna-round-2/genome.maker.output/genome_master_datastore_index.log
    maker2zff genome.all.gff
    fathom -categorize 1000 genome.ann genome.dna
    fathom -export 1000 -plus uni.ann uni.dna
    forge export.ann export.dna
    hmm-assembler.pl genome . > ../genome1.hmm

I didn't find any errors generated by Snap during training.
But when I reran MAKER, I got errors of this variety:

processing all repeats
processing all repeats
error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help'
error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help'
preparing masked sequence
processing all repeats
collecting blastx repeatmasking
preparing masked sequence
collecting blastx repeatmasking
collecting blastx repeatmasking
processing all repeats
processing all repeats
processing all repeats
processing all repeats
preparing masked sequence
collecting blastx repeatmasking
ERROR: Snap failed
--> rank=21, hostname=localhost
ERROR: Failed while preparing ab-inits
ERROR: Chunk failed at level:0, tier_type:2
FAILED CONTIG:scf7180000008536_pilon_pilon

ERROR: Snap failed
--> rank=5, hostname=localhost
ERROR: Failed while preparing ab-inits
ERROR: Chunk failed at level:0, tier_type:2
FAILED CONTIG:scf7180000008522_pilon_pilon

ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scf7180000008536_pilon_pilon

preparing masked sequence
ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scf7180000008522_pilon_pilon

Do you have any other suggestions?

Thanks in Advance,

Paul

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859

From liorglic at mail.tau.ac.il Sun Apr 7 08:25:22 2019
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Sun, 7 Apr 2019 16:25:22 +0300
Subject: [maker-devel] Curious pattern in AED distributions
Message-ID:

Hi MAKER users,

Lately I've been performing annotations for multiple genomes from the same species.
When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern that looks something like this:

[image: AED_hist.png]

This pattern is a bit surprising to me in two respects:
1) Why is there a surge towards 0.5?
2) Why is there a sudden drop right after that surge?

Has anyone else seen this, or is this a specific outcome of my data/configuration?
Any ideas of what may cause such a distribution?

While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
Would appreciate your comments.
Thank you!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: AED_hist.png
Type: image/png
Size: 8232 bytes
Desc: not available
URL:

From myandell at genetics.utah.edu Sun Apr 7 10:11:36 2019
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 7 Apr 2019 15:11:36 +0000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>

Hi Lior,

Fun! The short answer is I don't know. Obviously, the good stuff is on the right side of 0.5.

That said, I can think of a couple of things to look into to explain the left side of the graph. Are you allowing single exon genes? Are you using RNA seq data, protein, or both? What about repeat masking? Are you doing it? Do you have your own library?

My first guess would be low complexity/repeat sequences generating more or less random blastx hits across the genome. Carson, what do you think?

And finally, what does the AED look like for the genes included in the final build?

Sorry for all the questions, Lior. That's your punishment for asking an interesting one.

--mark
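One way to look into the single-exon question is to tabulate AED separately for single-exon and multi-exon models. The Python below is only a minimal sketch and is not part of MAKER. It assumes a merged MAKER GFF3 in which mRNA features carry an `_AED=` attribute and each exon is an `exon` feature with a `Parent=` attribute pointing at its mRNA; the function name `aed_histogram` and the `n_bins` parameter are hypothetical and chosen for illustration.

```python
import re
from collections import Counter, defaultdict

# Assumption: MAKER writes the score on mRNA lines as "_AED=<float>".
AED_RE = re.compile(r"(?:^|;)_AED=([0-9.]+)")
ID_RE = re.compile(r"(?:^|;)ID=([^;]+)")
PARENT_RE = re.compile(r"(?:^|;)Parent=([^;]+)")


def aed_histogram(gff3_text, n_bins=20):
    """Bin AED scores of mRNA features, split by exon count.

    Returns {"single": Counter, "multi": Counter}, each mapping a bin
    index (0 .. n_bins - 1) to the number of transcripts in that bin.
    """
    aed_by_mrna = {}
    exon_count = defaultdict(int)

    for line in gff3_text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # skip FASTA or malformed lines
        feature_type, attributes = cols[2], cols[8]
        if feature_type == "mRNA":
            id_match = ID_RE.search(attributes)
            aed_match = AED_RE.search(attributes)
            if id_match and aed_match:
                aed_by_mrna[id_match.group(1)] = float(aed_match.group(1))
        elif feature_type == "exon":
            parent_match = PARENT_RE.search(attributes)
            if parent_match:
                # An exon may list several parent transcripts.
                for parent in parent_match.group(1).split(","):
                    exon_count[parent] += 1

    hist = {"single": Counter(), "multi": Counter()}
    for mrna_id, score in aed_by_mrna.items():
        # AED = 1.0 falls into the last bin rather than a bin of its own.
        bin_index = min(int(score * n_bins), n_bins - 1)
        key = "single" if exon_count[mrna_id] <= 1 else "multi"
        hist[key][bin_index] += 1
    return hist
```

With the default of 20 bins, bin 10 covers AED values from 0.50 up to (but not including) 0.55, so a pile-up of single-exon models there would show up as a large `hist["single"][10]` next to a small `hist["multi"][10]`.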
From myandell at genetics.utah.edu Sun Apr 7 12:39:16 2019
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 7 Apr 2019 17:39:16 +0000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>
Message-ID: <116090CF-13B6-4E54-A5AA-8F7D7FCF2F23@umail.utah.edu>

Sorry. I'm dyslexic, especially early in the morning. Yes, good stuff is on the left.

As regards single exon genes, that's always a hard call, as these have a higher false positive rate. One thing to consider is how prevalent introns are in your organism. Carson can give more advice on this point, I'm sure.

By "final build", I meant: is this using the "Standard build" or "Max Build" protocol from PMC4286374?
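For readers following the left/right question: Annotation Edit Distance is defined as AED = 1 - (SN + SP) / 2, where SN is the fraction of evidence bases covered by the model and SP is the fraction of model bases covered by evidence, so 0 means full agreement and 1 means no overlap. The Python below is a simplified base-level sketch of that formula only, not MAKER's implementation; it ignores strand and reading frame, and the function names are hypothetical.

```python
def _covered_bases(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    covered = set()
    for start, end in intervals:
        covered.update(range(start, end + 1))
    return covered


def aed(model_exons, evidence_spans):
    """Base-level AED between a gene model and its supporting evidence.

    Returns 0.0 for perfect agreement and 1.0 when nothing overlaps
    (or when either side is empty).
    """
    model = _covered_bases(model_exons)
    evidence = _covered_bases(evidence_spans)
    if not model or not evidence:
        return 1.0
    shared = len(model & evidence)
    sensitivity = shared / len(evidence)  # evidence bases recovered by the model
    specificity = shared / len(model)     # model bases backed by evidence
    return 1.0 - (sensitivity + specificity) / 2.0


print(aed([(1, 100)], [(1, 100)]))    # 0.0, model and evidence agree exactly
print(aed([(1, 100)], [(51, 150)]))   # 0.5, half of each side overlaps
print(aed([(1, 100)], [(201, 300)]))  # 1.0, no overlap at all
```

The middle case shows why 0.5 is a natural boundary: a model scores 0.5 when, on average, only half of the model and half of the evidence agree.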
From ychliu at genetics.ac.cn Tue Apr 2 20:21:33 2019
From: ychliu at genetics.ac.cn (ychliu at genetics.ac.cn)
Date: Wed, 3 Apr 2019 09:21:33 +0800
Subject: [maker-devel] MAKER problem with gff3 file
Message-ID: <2019040309213197334742@genetics.ac.cn>

Dear MAKER developers,

I recently used MAKER for gene annotation. But even though I supplied a gff3 file as the EST evidence, the results contain no genes marked as est2genome (I did set est2genome=1). It seems the gff3 evidence is not being used. What is the problem, and how can I solve it? Eager for your assistance.

Faithfully yours,
Yucheng Liu

Yucheng Liu
Institute of Genetics and Developmental Biology, CAS
Beijing, 100101 China
Tel: 86-010-64801362
E-mail: ychliu at genetics.ac.cn

From liorglck at gmail.com Sun Apr 7 11:29:13 2019
From: liorglck at gmail.com (Lior Glick)
Date: Sun, 7 Apr 2019 19:29:13 +0300
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>
References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>
Message-ID:

Dear Mark,

Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (;

Before I answer them, I just want to make sure we're on the same page: as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation...

Now for the questions:
1) I have not done any filtering so far, so single exon genes are included as well. In fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes?
2) My evidence consists of assembled transcripts, proteins and predicted gene models (pred_gff).
3) As for repeats, I'm masking based on a repeat library obtained from a previous publication, specific to my organism of interest.

Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"?

Hope these answers are helpful, and waiting to hear more thoughts.
Thanks again.

From carsonhh at gmail.com Sun Apr 7 20:06:49 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:06:49 -0600
Subject: [maker-devel] MAKER problem with gff3 file
In-Reply-To: <2019040309213197334742@genetics.ac.cn>
References: <2019040309213197334742@genetics.ac.cn>
Message-ID: <961D15D1-36C0-4DD9-BE81-7C652A2C4CCF@gmail.com>

The est2genome=1 option in MAKER2 only works with input fasta files because it's based on Exonerate's est2genome alignments. It does not work with GFF3 input (gff3 is missing some things that are in the exonerate report). MAKER3, however, will let you do this with GFF3 input (it goes back and tries to predict the missing info that Exonerate would have produced).

--Carson

From carsonhh at gmail.com Sun Apr 7 20:08:54 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:08:54 -0600
Subject: [maker-devel] Installation Failure with pg_config?
In-Reply-To:
References:
Message-ID: <89174279-57D0-46D3-BE9D-FA03ED861227@gmail.com>

DBD::Pg is optional. You should be able to say "No" to the question on whether you want to install optional modules during the Build step.

--Carson

> On Mar 14, 2019, at 4:24 PM, Shaowen Jiang wrote:
>
> Dear MAKER2 admins:
>
> Hi, I have read some tutorials for annotating a newly assembled genome, and MAKER2 seems to be a very good and functional pipeline to me. So I am trying to use it to annotate a newly assembled mammalian genome that our lab just generated.
> But I got stuck while trying to install MAKER2 on our slurm HPC server.
> I think the pipeline is trying to install several perl packages locally, but one of them, DBD::Pg, requires the path to pg_config.
> screenshot as below
>
> But I think our server doesn't have this path, and I don't have root access to install other things such as libpq-dev or PostgreSQL.
> Are there any other methods that can get around that?
>
> Any help or advice would be appreciated!
>
> Thanks,
> Shaowen
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
URL: From carsonhh at gmail.com Sun Apr 7 20:32:39 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 7 Apr 2019 19:32:39 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: Message-ID: That?s interesting. It could be a handful of internal filters that help with spurious results. I use a 0.5 sensitivity/specificity to identify shared edges for a jaccardian split on overlapping evidence clusters for example. There are also a couple of places where if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or est splice support, but just a chunk od single exon blastx) then maker will use a reading frame aware AED value of 0.5 as a filter (as in it checks if the reading frame matches and not just raw overlap). If that?s the case, the spike near 0.5 may indicate I needed to be a little strickter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cuttoff for these spurious blastx induced models. ?Carson > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote: > > Hi MAKER users, > Lately I've been performing annotations for multiple genomes from the same species. > When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: > > This pattern is a bit surprising to me, in two aspects: > 1) Why is there a surge towards 0.5? > 2) Why is there a sudden drop right after that surge? > > Has anyone else seen this, or is this a specific outcome of my data/configuration? > Any ideas of what may cause such a distribution? > > While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. > Would appreciate your comments. > Thank you! 
> _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From xvazquezc at gmail.com Sun Apr 7 23:42:15 2019 From: xvazquezc at gmail.com (=?UTF-8?Q?Xabier_V=C3=A1zquez=2DCampos?=) Date: Mon, 8 Apr 2019 14:42:15 +1000 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: Message-ID: If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. e.g. by default it will require having some EST evidence On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote: > That?s interesting. It could be a handful of internal filters that help > with spurious results. > > I use a 0.5 sensitivity/specificity to identify shared edges for a > jaccardian split on overlapping evidence clusters for example. There are > also a couple of places where if the only thing supporting a model is a > single exon blastx hit (i.e. no exonerate, ab initio model, or est splice > support, but just a chunk od single exon blastx) then maker will use a > reading frame aware AED value of 0.5 as a filter (as in it checks if the > reading frame matches and not just raw overlap). If that?s the case, the > spike near 0.5 may indicate I needed to be a little strickter than my > empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cuttoff > for these spurious blastx induced models. > > ?Carson > > > > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote: > > > > Hi MAKER users, > > Lately I've been performing annotations for multiple genomes from the > same species. > > When plotting the histogram of AED scores over all genes, I repeatedly > see a very specific pattern, that looks something like this: > > > > This pattern is a bit surprising to me, in two aspects: > > 1) Why is there a surge towards 0.5? > > 2) Why is there a sudden drop right after that surge? 
> > > > Has anyone else seen this, or is this a specific outcome of my > data/configuration? > > Any ideas of what may cause such a distribution? > > > > While this is not necessarily an indication of a problem or bug, it does > seem a bit odd, and might imply some bias or artifact. > > Would appreciate your comments. > > Thank you! > > _______________________________________________ > > maker-devel mailing list > > maker-devel at box290.bluehost.com > > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -- Xabier V?zquez-Campos, *PhD* *Research Associate* NSW Systems Biology Initiative School of Biotechnology and Biomolecular Sciences The University of New South Wales Sydney NSW 2052 AUSTRALIA -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Mon Apr 8 00:20:24 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 7 Apr 2019 23:20:24 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: Message-ID: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Yes. maker2zff tries to further select a subset of the best supported models by requiring multiple forms of evidence support. ?Carson > On Apr 7, 2019, at 10:42 PM, Xabier V?zquez-Campos wrote: > > If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. e.g. by default it will require having some EST evidence > > On Mon, 8 Apr 2019 at 11:32, Carson Holt > wrote: > That?s interesting. It could be a handful of internal filters that help with spurious results. > > I use a 0.5 sensitivity/specificity to identify shared edges for a jaccardian split on overlapping evidence clusters for example. 
There are also a couple of places where if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or est splice support, but just a chunk od single exon blastx) then maker will use a reading frame aware AED value of 0.5 as a filter (as in it checks if the reading frame matches and not just raw overlap). If that?s the case, the spike near 0.5 may indicate I needed to be a little strickter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cuttoff for these spurious blastx induced models. > > ?Carson > > > > On Apr 7, 2019, at 7:25 AM, Lior Glick > wrote: > > > > Hi MAKER users, > > Lately I've been performing annotations for multiple genomes from the same species. > > When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: > > > > This pattern is a bit surprising to me, in two aspects: > > 1) Why is there a surge towards 0.5? > > 2) Why is there a sudden drop right after that surge? > > > > Has anyone else seen this, or is this a specific outcome of my data/configuration? > > Any ideas of what may cause such a distribution? > > > > While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. > > Would appreciate your comments. > > Thank you! 
> > _______________________________________________ > > maker-devel mailing list > > maker-devel at box290.bluehost.com > > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > -- > Xabier Vázquez-Campos, PhD > Research Associate > NSW Systems Biology Initiative > School of Biotechnology and Biomolecular Sciences > The University of New South Wales > Sydney NSW 2052 AUSTRALIA -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Mon Apr 8 01:54:06 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Mon, 8 Apr 2019 09:54:06 +0300 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: Hello again and thank you all for your interesting answers. I mistakenly answered Mark yesterday from an unsubscribed address, so only he received it. For documentation's sake, I'm posting my answer here again, along with Mark's reply: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Dear Mark, Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (; Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation... Now for the questions: 1) I have not applied any filtering so far, so single exon genes are included as well. In fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes?
2) My evidence consists of assembled transcripts, proteins and predicted gene models (pred_gff). 3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest. Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"? Hope these answers are helpful, and waiting to hear more thoughts. Thanks again. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *To which Mark replied:* Sorry. I'm dyslexic, especially early in the morning. Yes, good stuff is on the left. As regards single exon genes, that's always a hard call, as these have a higher false positive rate. Things to consider include how prevalent introns are in your organism. Carson can give more advice on this point, I'm sure. By "final build", I meant: is this using the "Standard build" or "Max Build" protocol from PMC4286374? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Mark - well, as I said I haven't done any filtration yet, so I guess my annotation currently includes genes that would be discarded even with the "max build". I'll give this a try and look at the resulting distribution. Xabier - thanks, but I'm not using SNAP (just Augustus). Carson - I see a few fingers pointing in the direction of single-exon models, so maybe I should see what happens to the distribution of AED when these genes are removed. I'll get back to you with some more results. On Mon, 8 Apr 2019 at 8:20, Carson Holt wrote: > Yes. maker2zff tries to further select a subset of the best supported > models by requiring multiple forms of evidence support. > > --Carson > > > On Apr 7, 2019, at 10:42 PM, Xabier Vázquez-Campos > wrote: > > If you train SNAP, the maker2zff script has internal quality cutoffs based > on the existence of evidence. e.g.
by default it will require having some > EST evidence > > On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote: > >> That's interesting. It could be a handful of internal filters that help >> with spurious results. >> >> I use a 0.5 sensitivity/specificity to identify shared edges for a >> jaccardian split on overlapping evidence clusters for example. There are >> also a couple of places where if the only thing supporting a model is a >> single exon blastx hit (i.e. no exonerate, ab initio model, or est splice >> support, but just a chunk of single exon blastx) then maker will use a >> reading frame aware AED value of 0.5 as a filter (as in it checks if the >> reading frame matches and not just raw overlap). If that's the case, the >> spike near 0.5 may indicate I needed to be a little stricter than my >> empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff >> for these spurious blastx induced models. >> >> --Carson >> >> >> > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote: >> > >> > Hi MAKER users, >> > Lately I've been performing annotations for multiple genomes from the >> same species. >> > When plotting the histogram of AED scores over all genes, I repeatedly >> see a very specific pattern, that looks something like this: >> > >> > This pattern is a bit surprising to me, in two aspects: >> > 1) Why is there a surge towards 0.5? >> > 2) Why is there a sudden drop right after that surge? >> > >> > Has anyone else seen this, or is this a specific outcome of my >> data/configuration? >> > Any ideas of what may cause such a distribution? >> > >> > While this is not necessarily an indication of a problem or bug, it >> does seem a bit odd, and might imply some bias or artifact. >> > Would appreciate your comments. >> > Thank you!
>> > _______________________________________________ >> > maker-devel mailing list >> > maker-devel at box290.bluehost.com >> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > > -- > Xabier Vázquez-Campos, *PhD* > *Research Associate* > NSW Systems Biology Initiative > School of Biotechnology and Biomolecular Sciences > The University of New South Wales > Sydney NSW 2052 AUSTRALIA > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Mon Apr 8 04:10:15 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Mon, 8 Apr 2019 12:10:15 +0300 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: Hi again - quick update: I made a plot comparing the histograms of single-exon genes to multi-exon genes: [image: newplot (5).png] It definitely looks like single-exon genes are *enriched* for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software. Any other thoughts/suggestions? Thanks again, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: newplot (5).png Type: image/png Size: 18037 bytes Desc: not available URL: From carsonhh at gmail.com Mon Apr 8 11:48:42 2019 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 8 Apr 2019 10:48:42 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: One note.
When I say single exon blastx hit, I mean that the evidence is single exon, not that the gene model is single exon. What I think you are seeing is an effect that seems to be partially related to under-masking, i.e. a spurious partial blastx alignment to a low complexity repeat (which is why the blastx protein alignment refuses to polish with exonerate). That is why the filter was added. So if a model (single or multi-exon) has no additional ab initio prediction support, has no EST support, and has no exonerate polished protein support, but does have a single-exon/single-HSP blastx overlap, it gets filtered out at an AED of 0.5. That threshold is based on trial and error on a couple of genomes where we saw this occur, but your graph suggests the filter might be too loose and that 0.4 or 0.45 might be a better value. So the spike is caused by poor blastx alignments and under-masking (which could be explained if the models you pass in via pred_gff were generated on an unmasked assembly outside of MAKER), and the drop around 0.5 is caused by MAKER filtering out models supported only by what appear to be spurious blastx alignments. --Carson > On Apr 8, 2019, at 3:10 AM, Lior Glick wrote: > > Hi again - quick update: > I made a plot comparing the histograms of single-exon genes to multi-exon genes: > > It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software. > Any other thoughts/suggestions? > > Thanks again, > -------------- next part -------------- An HTML attachment was scrubbed...
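[Editorial sketch, not part of the original thread.] One way to look at the 0.5 spike discussed above is to tabulate the `_AED` values of MAKER mRNA features separately for transcripts with one, two, or three or more exons. The Python below is a minimal sketch under stated assumptions: the input is tab-separated GFF3 as MAKER writes it, exons are counted from `exon` features via their `Parent` attribute, and the function names `parse_attributes` and `aed_histogram_by_exon_class` are hypothetical helpers, not part of MAKER.

```python
from collections import Counter, defaultdict


def parse_attributes(column9):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    pairs = (item.split("=", 1) for item in column9.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}


def aed_histogram_by_exon_class(gff_lines, bin_pct=5):
    """Count mRNAs per AED bin, split by exon-count class.

    Bin keys are the lower edge of the bin in hundredths of AED,
    e.g. key 45 covers AED 0.45 to 0.49 when bin_pct is 5.
    """
    aed = {}                # mRNA ID -> AED value
    exon_count = Counter()  # mRNA ID -> number of exon children
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skips comments, blank lines and FASTA sections
        attrs = parse_attributes(cols[8])
        if cols[2] == "mRNA" and "_AED" in attrs and "ID" in attrs:
            aed[attrs["ID"]] = float(attrs["_AED"])
        elif cols[2] == "exon":
            # GFF3 allows one exon line to list several parents.
            for parent in attrs.get("Parent", "").split(","):
                if parent:
                    exon_count[parent] += 1

    histogram = defaultdict(Counter)
    for mrna_id, value in aed.items():
        n = exon_count[mrna_id]
        if n == 0:
            continue  # no exon features seen for this transcript
        label = "1 exon" if n == 1 else "2 exons" if n == 2 else "3+ exons"
        hundredths = round(value * 100)  # AED is written with two decimals
        histogram[label][(hundredths // bin_pct) * bin_pct] += 1
    return histogram
```

Passing an open file handle for the MAKER GFF3 output as `gff_lines` should work, since the function only iterates over lines. The per-class counts can then be plotted with any tool.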
URL: From carsonhh at gmail.com Mon Apr 8 11:51:55 2019 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 8 Apr 2019 10:51:55 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: <75B7E2C9-2D1B-452F-BEED-704289C881ED@gmail.com> Try also adding 2 exon models to the graph. It would be interesting to see if these are attempted single-exon models where the predictor added a micro-intron to keep the open reading frame going against a single exon blastx hint. --Carson > On Apr 8, 2019, at 3:10 AM, Lior Glick wrote: > > Hi again - quick update: > I made a plot comparing the histograms of single-exon genes to multi-exon genes: > > It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software. > Any other thoughts/suggestions? > > Thanks again, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ying.hu at ufl.edu Wed Apr 17 10:20:00 2019 From: ying.hu at ufl.edu (Hu,Ying) Date: Wed, 17 Apr 2019 15:20:00 +0000 Subject: [maker-devel] maker exons number Message-ID: Hi, Carson, I am using MAKER 2.31.6 to annotate a genome. I noticed that the exon numbers within each gene do not start from 1. Can you give me some suggestions on how to change the exon numbers to 1, 2, 3, ... in each gene? Thanks, Ying Here are some examples: tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0 tig00000226|arrow maker mRNA 26339 27915 . + .
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82 tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27490 27636 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27738 27808 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 tig00000226|arrow maker mRNA 5803 6589 . + . 
ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 5803 5975 . + 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 6505 6589 . + 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker five_prime_UTR 40927 41192 . + . 
ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker gene 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 tig00034405|arrow maker mRNA 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 59245 59725 . 
+ 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 tig00034405|arrow maker exon 90355 90911 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91011 91086 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93711 93786 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94536 94573 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93866 93972 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91179 91240 . 
- 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From huyingwin at gmail.com Wed Apr 17 10:23:10 2019 From: huyingwin at gmail.com (YING HU) Date: Wed, 17 Apr 2019 11:23:10 -0400 Subject: [maker-devel] maker exon number Message-ID: Hi, Carson, I am using MAKER 2.31.6 to annotate a genome. I noticed that exon number in each gene does not start from 1. Can you give me some suggestions how to change the exon number to 1,2,3 .. In each gene? Thansks, Ying Here are some examples: tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0 tig00000226|arrow maker mRNA 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82 tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27490 27636 . + . 
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27738 27808 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 tig00000226|arrow maker mRNA 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 5803 5975 . 
+ 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 6505 6589 . + 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker five_prime_UTR 40927 41192 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker gene 57931 61565 . + . 
ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 tig00034405|arrow maker mRNA 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 59245 59725 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 tig00034405|arrow maker exon 90355 90911 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91011 91086 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93711 93786 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94536 94573 . 
- 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93866 93972 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91179 91240 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 -------------- next part -------------- An HTML attachment was scrubbed... 
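[Editorial sketch, not part of the original thread.] On the exon-numbering question: in the records above, the number at the end of each exon `ID` looks like a running counter rather than a position within the gene (an observation from this example only, not a statement about MAKER internals). If a per-transcript exon rank is what is needed, one option is to compute it from coordinates and strand rather than editing the IDs. The Python below is a minimal sketch assuming tab-separated GFF3 input; `exon_ranks` is a hypothetical helper name, not a MAKER script.

```python
from collections import defaultdict


def exon_ranks(gff_lines):
    """Map each mRNA ID to its exons as (rank, start, end, exon_id), in 5'->3' order."""
    exons = defaultdict(list)  # mRNA ID -> [(start, end, strand, exon_id), ...]
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(item.split("=", 1) for item in cols[8].split(";") if "=" in item)
        # One exon line may belong to several transcripts (comma-separated Parent).
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append(
                    (int(cols[3]), int(cols[4]), cols[6], attrs.get("ID", ""))
                )

    ranked = {}
    for mrna_id, features in exons.items():
        # On the minus strand the first exon is the one with the highest coordinate.
        minus_strand = features[0][2] == "-"
        features.sort(key=lambda f: f[0], reverse=minus_strand)
        ranked[mrna_id] = [
            (rank, start, end, exon_id)
            for rank, (start, end, _strand, exon_id) in enumerate(features, 1)
        ]
    return ranked
```

Because the rank is stored per transcript, an exon shared by two transcripts can carry a different rank in each, and the original `ID` values stay untouched.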
From carsonhh at gmail.com Wed Apr 17 14:43:41 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 17 Apr 2019 13:43:41 -0600
Subject: [maker-devel] maker exon number
In-Reply-To:
References:
Message-ID:

The ID= value is simply a unique identifier used, in conjunction with Parent=, to resolve inheritance between features. It has no biological meaning. Also, to reduce redundancy, the GFF3 format allows a single 'exon' feature to be the child of multiple mRNA features, so a single 'exon' line can be the first exon in one transcript but the second exon in another.

--Carson

> On Apr 17, 2019, at 9:23 AM, YING HU wrote:
>
> Hi, Carson,
>
> I am using MAKER 2.31.6 to annotate a genome. I noticed that exon number in each gene does not start from 1. Can you give me some suggestions how to change the exon number to 1,2,3 .. In each gene? Thanks,
>
> Ying
>
> Here are some examples:
>
> tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow
> tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0
> tig00000226|arrow maker mRNA 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82
> tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1
> tig00000226|arrow maker exon 27490 27636 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1
> tig00000226|arrow maker exon 27738 27808 . + .
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 > tig00000226|arrow maker mRNA 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 > tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > tig00000226|arrow maker CDS 5803 5975 . + 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > > tig00000226|arrow maker CDS 6505 6589 . 
+ 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > > tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow > tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 > tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 > tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker five_prime_UTR 40927 41192 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker gene 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 > tig00034405|arrow maker mRNA 57931 61565 . + . 
ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 > tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker CDS 59245 59725 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 > tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 > tig00034405|arrow maker exon 90355 90911 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 91011 91086 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93711 93786 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 94536 94573 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93866 93972 . 
- 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 91179 91240 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > > tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... 
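Editorial note: since the number embedded in an exon ID carries no biological meaning, a per-transcript exon order (1, 2, 3, ...) has to be derived from coordinates and strand rather than read from the ID. The following is a minimal sketch of that idea, not a MAKER utility. It assumes a tab-delimited GFF3 file; the names `gff_row` and `exon_ranks` are hypothetical helpers made up for this illustration. An exon whose Parent= lists several transcripts gets a separate rank in each one, which matches the point made in the reply above.

```python
from collections import defaultdict


def gff_row(seqid, ftype, start, end, strand, attributes):
    """Build one tab-delimited GFF3 line (illustrative helper, source is 'maker')."""
    return "\t".join(
        [seqid, "maker", ftype, str(start), str(end), ".", strand, ".", attributes]
    )


def exon_ranks(gff_lines):
    """Return {transcript_id: [(rank, start, end), ...]} in 5'->3' order."""
    exons = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        # One exon line may belong to several transcripts (comma-separated Parent).
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((start, end, strand))
    ranks = {}
    for parent, items in exons.items():
        # Minus-strand transcripts are read from the highest coordinate down.
        reverse = items[0][2] == "-"
        ordered = sorted(items, key=lambda e: e[0], reverse=reverse)
        ranks[parent] = [(i, s, e) for i, (s, e, _) in enumerate(ordered, start=1)]
    return ranks


if __name__ == "__main__":
    # Coordinates taken from the rows quoted in this thread.
    mrna = "maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1"
    rows = [
        gff_row("tig00034405|arrow", "exon", 40927, 41273, "+",
                f"ID={mrna}:exon:7157;Parent={mrna}"),
        gff_row("tig00034405|arrow", "exon", 41476, 41622, "+",
                f"ID={mrna}:exon:7158;Parent={mrna}"),
        gff_row("tig00034405|arrow", "exon", 50954, 51025, "+",
                f"ID={mrna}:exon:7159;Parent={mrna}"),
    ]
    for rank, start, end in exon_ranks(rows)[mrna]:
        print(rank, start, end)
```

Keeping the ranks in a separate table, rather than rewriting the ID= values, leaves the Parent/ID links in the GFF3 untouched.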
From paul at tupac.bio Thu Apr 18 04:23:35 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Thu, 18 Apr 2019 18:23:35 +0900
Subject: [maker-devel] maker_functional_gff Error
Message-ID:

Dear MAKER Team,

I am running MAKER 2.31.10 on a 32-core instance. I followed the Post Processing of Annotations steps as described in the MAKER Tutorial for GMOD Online Training 2014 as best I could, but I get an error when I run maker_functional_gff. The commands, in order of execution, and the relevant output are shown below.

What did I do wrong?

# run blastp command
blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 -out output.blastp

# run interproscan command
interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p -i genome.all.maker.proteins.fasta -o output.iprscan

# create naming table
maker_map_ids --prefix ThuMac --justify 5 genome.all.gff > genome.all.map

# copy files for safe keeping
cp genome.all.gff genome.all.renamed.gff
cp genome.all.noseq.gff genome.all.noseq.renamed.gff
cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta
cp genome.all.maker.proteins.aed.0.50.fasta genome.all.maker.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.unique.proteins.aed.0.50.fasta genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.transcripts.fasta genome.all.maker.transcripts.renamed.fasta
cp genome.all.maker.transcripts.aed.0.50.fasta genome.all.maker.transcripts.aed.0.50.renamed.fasta
cp output.iprscan output.renamed.iprscan
cp output.blastp output.renamed.blastp

# replace uninformative MAKER protein/transcript names with useful ones
map_gff_ids genome.all.map genome.all.renamed.gff
map_gff_ids genome.all.map genome.all.noseq.renamed.gff
map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.proteins.aed.0.50.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.transcripts.aed.0.50.renamed.fasta
map_data_ids genome.all.map output.renamed.iprscan
map_data_ids genome.all.map output.renamed.blastp

# assign annotations
maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff

> head output.renamed.blastp
ThuMac30929-RA P20036 41.791 134 77 1 326 458 113 246 9.51e-28 114
ThuMac19623-RA P81018 35.714 168 87 2 1 147 1 168 8.40e-33 117
ThuMac19629-RA Q66I51 68.939 264 79 2 1 263 1 262 1.48e-130 372
ThuMac19628-RA Q61464 55.172 87 37 1 766 852 382 466 4.42e-25 119
ThuMac19627-RA P07898 48.276 58 29 1 13 69 1962 2019 3.60e-13 65.9
ThuMac19626-RA P81018 36.782 174 96 2 21 180 1 174 5.75e-36 127
ThuMac19624-RA P81018 35.057 174 99 2 21 180 1 174 2.19e-33 120
ThuMac19625-RA Q28343 32.520 123 43 2 35 117 2123 2245 7.57e-17 78.6
ThuMac19636-RA Q9QX29 90.909 110 10 0 5 114 458 567 6.45e-65 216
ThuMac19638-RA Q9QX29 57.391 115 35 3 5 114 703 808 3.06e-28 120

> head output.renamed.iprscan
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00520 Ion transport protein 154 413 3.8E-21 T 18-04-2019 IPR005821 Ion transport domain GO:0005216|GO:0006811|GO:0016020|GO:0055085
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF08412 Ion transport protein N-terminal 109 152 5.1E-19 T 18-04-2019 IPR013621 Ion transport N-terminal Reactome: R-HSA-1296061
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00027 Cyclic nucleotide-binding domain 519 601 1.0E-17 T 18-04-2019 IPR000595 Cyclic nucleotide-binding domain
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF13765 SPRY-associated domain 235 283 8.9E-23 T 18-04-2019 IPR006574 SPRY-associated
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00643 B-box zinc finger 18 56 5.2E-12 T 18-04-2019 IPR000315 B-box-type zinc finger GO:0008270
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00622 SPRY domain 287 391 2.2E-14 T 18-04-2019 IPR003877 SPRY domain GO:0005515
ThuMac08369-RA 7aee1da5a47975ab8e43b68bfd1a117c 139 Pfam PF00076 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain) 22 87 1.6E-15 T 18-04-2019 IPR000504 RNA recognition motif domain GO:0003676
ThuMac26054-RA 8f4119609312bd6442f8bb094c104231 462 Pfam PF07565 Band 3 cytoplasmic domain 173 443 7.3E-100 T 18-04-2019 IPR013769 Band 3 cytoplasmic domain GO:0006820|GO:0008509|GO:0016021 Reactome: R-HSA-425381
ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF03372 Endonuclease/Exonuclease/phosphatase family 235 535 7.0E-11 T 18-04-2019 IPR005135 Endonuclease/exonuclease/phosphatase
ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF17751 SKICH domain 555 649 9.8E-23 T 18-04-2019 IPR041611 SKICH domain

> map_data_ids genome.all.map output.renamed.iprscan
WARNING: No mapping available for ThuMac01937-RA
WARNING: No mapping available for ThuMac02226-RA
WARNING: No mapping available for ThuMac20730-RA
WARNING: No mapping available for ThuMac20730-RA
WARNING: No mapping available for ThuMac14750-RA
(Thousands of warnings like these were returned)

> maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff
Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.

> head genome.all.renamed.putative_function.gff
##gff-version 3
scf7180000008677_pilon_pilon . contig 1 49996 . . . ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon

Thanks in Advance,

Paul Sheridan

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859
-------------- next part --------------
An HTML attachment was scrubbed...
From carsonhh at gmail.com Mon Apr 22 12:50:27 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Apr 2019 11:50:27 -0600
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To:
References:
Message-ID: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>

The first message, "WARNING: No mapping available for ThuMac01937-RA", means you are running on a file that has already been renamed. A file that has not been renamed will have names like maker-SDFGDG-gene-0.1-mRNA-1, for example, but it's finding the name ThuMac01937-RA, which is not in the first column of the map file. So it throws a warning.

The second one:

> Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.

You likely have a truncated line in the GFF3. It's missing an ID= tag. This can sometimes happen when writing to network-mounted (NFS) file systems because of an asynchronous IO error. NFS file systems have a performance enhancement where they return SUCCESS on IO operations right away and then complete the IO operation later in the background. This improves speed by letting the program advance without blocking on the IO operation, but it reduces reliability: if the later operation is not really successful, the file system can't go back and tell the program "never mind, it failed." The result is a silent truncation of data. It is not super common, but not all that rare either, depending on IO load (e.g. heavy MPI with lots of writes). Find the line that's truncated, then rerun just that contig before building the merged gff3 for everything.

--Carson

> On Apr 18, 2019, at 3:23 AM, Paul Sheridan wrote:
>
> Dear MAKER Team,
>
> I am running MAKER 2.31.10 a 32 core instance. I followed the Post Processing of Annotations steps as described in the MAKER Tutorial for GMOD Online Training 2014 as best I could, but I get an error when I run maker_functional_gff. The commands in the order of execution and relevant output are shown below.
>
> Where did I do wrong?
-------------- next part --------------
An HTML attachment was scrubbed...
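Editorial note: both diagnoses in the reply above can be checked with a short script before rerunning anything. The following is a minimal sketch, not part of MAKER. It assumes the GFF3 is tab-delimited, that every feature line is expected to carry an ID= attribute (as in the rows shown in this thread), and that the map file has two whitespace-separated columns, old name then new name. The function names `find_suspect_gff_lines` and `classify_ids` are hypothetical.

```python
def find_suspect_gff_lines(lines):
    """Yield (line_number, reason) for feature lines that look truncated."""
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if text.startswith("##FASTA"):
            break  # the sequence section follows; no more feature lines
        if not text or text.startswith("#"):
            continue
        cols = text.split("\t")
        if len(cols) != 9:
            yield number, f"expected 9 columns, found {len(cols)}"
            continue
        has_id = any(
            field.startswith("ID=") and len(field) > 3
            for field in cols[8].split(";")
        )
        if not has_id:
            yield number, "no ID= attribute"


def classify_ids(map_lines, ids):
    """Count ids found as old names (column 1), new names (column 2), or neither."""
    old_names, new_names = set(), set()
    for line in map_lines:
        parts = line.split()
        if len(parts) >= 2:
            old_names.add(parts[0])
            new_names.add(parts[1])
    counts = {"old": 0, "new": 0, "unknown": 0}
    for name in ids:
        if name in old_names:
            counts["old"] += 1
        elif name in new_names:
            counts["new"] += 1
        else:
            counts["unknown"] += 1
    return counts


if __name__ == "__main__":
    # File names follow the commands in this thread; adjust as needed.
    with open("genome.all.renamed.gff") as handle:
        for number, reason in find_suspect_gff_lines(handle):
            print(f"line {number}: {reason}")
    with open("genome.all.map") as map_handle, open("output.renamed.blastp") as data:
        first_column = [row.split("\t")[0] for row in data if row.strip()]
        print(classify_ids(map_handle, first_column))
```

If most IDs in a data file are counted as "new", the file has probably been renamed already, which would be consistent with the "No mapping available" warnings. Any line reported by the first function points to the contig worth rerunning.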
From paul at tupac.bio Sun Apr 28 20:40:12 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Mon, 29 Apr 2019 10:40:12 +0900
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
References: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
Message-ID:

Hi Carson,

Thanks, your suggestions got me sorted out.

Best,
Paul

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859
-------------- next part --------------
An HTML attachment was scrubbed...

From bastian.schiffthaler at umu.se Wed Apr 24 02:43:15 2019
From: bastian.schiffthaler at umu.se (Bastian Schiffthaler)
Date: Wed, 24 Apr 2019 07:43:15 -0000
Subject: [maker-devel] Redundant FASTA headers
Message-ID: <251d38f5-c15a-6070-fcd9-d6144744885e@umu.se>

Hi,

I'm running the MPI version of MAKER and I'm supplying seven different trinity assemblies (different experiments) as evidence. Now trinity will not generate unique FASTA headers >across< files, so I'm wondering if there could be an issue with ID collision? What does MAKER use the headers for? Could it create race conditions in temp files?

Thanks in advance,
Bastian

From Christian_jpg2 at hotmail.com Tue Apr 30 13:42:31 2019
From: Christian_jpg2 at hotmail.com (Christian Ayala)
Date: Tue, 30 Apr 2019 18:42:31 +0000
Subject: [maker-devel] Running out of time in MAKER
Message-ID:

Good afternoon,

I am trying to annotate some insect genomes using MAKER. MAKER is running in a system that uses a PBS scheduler and has a walltime of 120 hours. So, my jobs are running out of time and are killed before MAKER finishes the annotation. Is there a way to resume a killed MAKER run?

Thanks for your help.

Best regards,
Christian Ayala-Ortiz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From paul at tupac.bio Thu Apr 4 16:46:12 2019 From: paul at tupac.bio (Paul Sheridan) Date: Fri, 5 Apr 2019 07:46:12 +0900 Subject: [maker-devel] Running SNAP with MAKER Message-ID: Dear MAKER Team, I am running MAKER 2.31.10 a 32 core instance. My first pass completed successfully. However, my second pass using SNAP and Augustus trained ab initio gene predictions failed. Here is some example output which illustrates the problem: MAKER WARNING: Changes in control files make re-use of all old data impossible All old files will be erased before continuing processing all repeats doing repeat masking doing repeat masking #--------------------------------------------------------------------- Now starting the contig!! SeqID: scf7180000008677_pilon_pilon Length: 49996 #--------------------------------------------------------------------- doing repeat masking preparing ab-inits running snap. #--------- command -------------# Widget::snap: /usr/bin/snap /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm /tmp/maker_8RuX8Z/scf718 0000006915_pilon_pilon.abinit_masked.0 > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske d.0.genome%2Ehmm.snap #-------------------------------# setting up GFF3 output and fasta chunks processing all repeats doing repeat masking in cluster::shadow_cluster... ...finished clustering. error: unknown command "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap help' ERROR: Snap failed --> rank=21, hostname=localhost ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2 FAILED CONTIG:scf7180000007575_pilon_pilon I confirmed that the path to genome.hmm is correct. 
In addition, run.log contains the following kind of output: STARTED genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m asked.0.genome%2Ehmm.snap DIED RANK 30:4:0:0 DIED COUNT 2 DIED RANK 30 DIED COUNT 2 How can I resolve this issue? Also, is the warning about it being impossible to use the old data to be expected? Attached files: - maker_otps1.ctl: first pass control file - maker_opts2.ctl: second pass control file - run.log: log file for an example contig Thanks in Advance, Paul Sheridan -- CSO at Tupac Bio Email: paul at tupac.bio Homepage: www.paulsheridan.net Mobile: +81 80 7889 0859 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts2.ctl Type: application/octet-stream Size: 4727 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: run.log Type: application/octet-stream Size: 2366 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts1.ctl Type: application/octet-stream Size: 4514 bytes Desc: not available URL: From carsonhh at gmail.com Sat Apr 6 15:00:14 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 6 Apr 2019 15:00:14 -0600 Subject: [maker-devel] Running SNAP with MAKER In-Reply-To: References: Message-ID: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com> The error is being thrown by snap itself. Perhaps there is an issue with the genome.hmm file. Did you generate the file immediately previously to this run? Perhaps you can redo that process, and review any errors that come up during training. 
Some details on training SNAP from the wiki -->
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors

--Carson

> On Apr 4, 2019, at 4:46 PM, Paul Sheridan wrote:
>
> Dear MAKER Team,
>
> I am running MAKER 2.31.10 a 32 core instance. My first pass completed successfully. However, my second pass using SNAP and Augustus trained ab initio gene predictions failed. Here is some example output which illustrates the problem:
>
> MAKER WARNING: Changes in control files make re-use of all old data impossible
> All old files will be erased before continuing
> processing all repeats
> doing repeat masking
> doing repeat masking
> #---------------------------------------------------------------------
> Now starting the contig!!
> SeqID: scf7180000008677_pilon_pilon
> Length: 49996
> #---------------------------------------------------------------------
>
> doing repeat masking
> preparing ab-inits
> running snap.
> #--------- command -------------#
> Widget::snap:
> /usr/bin/snap /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm /tmp/maker_8RuX8Z/scf718
> 0000006915_pilon_pilon.abinit_masked.0 > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske
> d.0.genome%2Ehmm.snap
> #-------------------------------#
> setting up GFF3 output and fasta chunks
> processing all repeats
> doing repeat masking
> in cluster::shadow_cluster...
> ...finished clustering.
> error: unknown command "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap help'
> ERROR: Snap failed
> --> rank=21, hostname=localhost
> ERROR: Failed while preparing ab-inits
> ERROR: Chunk failed at level:0, tier_type:2
> FAILED CONTIG:scf7180000007575_pilon_pilon
>
> I confirmed that the path to genome.hmm is correct.
> In addition, run.log contains the following kind of output:
>
> STARTED genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m
> asked.0.genome%2Ehmm.snap
> DIED RANK 30:4:0:0
> DIED COUNT 2
> DIED RANK 30
> DIED COUNT 2
>
> How can I resolve this issue?
>
> Also, is the warning about it being impossible to use the old data to be expected?
>
> Attached files:
> - maker_otps1.ctl: first pass control file
> - maker_opts2.ctl: second pass control file
> - run.log: log file for an example contig
>
> Thanks in Advance,
>
> Paul Sheridan
>
> --
> CSO at Tupac Bio
> Email: paul at tupac.bio
> Homepage: www.paulsheridan.net
> Mobile: +81 80 7889 0859
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From paul at tupac.bio Sun Apr 7 03:27:53 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Sun, 7 Apr 2019 18:27:53 +0900
Subject: [maker-devel] Running SNAP with MAKER
In-Reply-To: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com>
References: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com>
Message-ID:

Hi Carson,

Indeed, I did generate the hmm file immediately before my second run. I redid the process by following these commands from the link you supplied:

mkdir snap
cd snap
gff3_merge -d /root/tuna-round-2/genome.maker.output/genome_master_datastore_index.log
maker2zff genome.all.gff
fathom -categorize 1000 genome.ann genome.dna
fathom -export 1000 -plus uni.ann uni.dna
forge export.ann export.dna
hmm-assembler.pl genome . > ../genome1.hmm

I didn't find any errors generated by Snap during training.
But when I reran MAKER, I got errors of this variety:

processing all repeats
processing all repeats
error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help'
error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help'
preparing masked sequence
processing all repeats
collecting blastx repeatmasking
preparing masked sequence
collecting blastx repeatmasking
collecting blastx repeatmasking
processing all repeats
processing all repeats
processing all repeats
processing all repeats
preparing masked sequence
collecting blastx repeatmasking
ERROR: Snap failed
--> rank=21, hostname=localhost
ERROR: Failed while preparing ab-inits
ERROR: Chunk failed at level:0, tier_type:2
FAILED CONTIG:scf7180000008536_pilon_pilon
ERROR: Snap failed
--> rank=5, hostname=localhost
ERROR: Failed while preparing ab-inits
ERROR: Chunk failed at level:0, tier_type:2
FAILED CONTIG:scf7180000008522_pilon_pilon
ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scf7180000008536_pilon_pilon
preparing masked sequence
ERROR: Chunk failed at level:4, tier_type:0
FAILED CONTIG:scf7180000008522_pilon_pilon

Do you have any other suggestions?

Thanks in Advance,

Paul

On Sun, Apr 7, 2019 at 6:00 AM Carson Holt wrote:

> The error is being thrown by snap itself. Perhaps there is an issue with
> the genome.hmm file. Did you generate the file immediately previously to
> this run? Perhaps you can redo that process, and review any errors that
> come up during training.
>
> Some details on training SNAP from the wiki -->
> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors
>
> --Carson
>
>
> On Apr 4, 2019, at 4:46 PM, Paul Sheridan wrote:
>
> Dear MAKER Team,
>
> I am running MAKER 2.31.10 a 32 core instance. My first pass completed
> successfully.
However, my second pass using SNAP and Augustus trained ab > initio gene predictions failed. Here is some example output which > illustrates the problem: > > MAKER WARNING: Changes in control files make re-use of all old data > impossible > All old files will be erased before continuing > processing all repeats > doing repeat masking > doing repeat masking > #--------------------------------------------------------------------- > Now starting the contig!! > SeqID: scf7180000008677_pilon_pilon > Length: 49996 > #--------------------------------------------------------------------- > > doing repeat masking > preparing ab-inits > running snap. > #--------- command -------------# > Widget::snap: > /usr/bin/snap > /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm > /tmp/maker_8RuX8Z/scf718 > 0000006915_pilon_pilon.abinit_masked.0 > > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske > d.0.genome%2Ehmm.snap > #-------------------------------# > setting up GFF3 output and fasta chunks > processing all repeats > doing repeat masking > in cluster::shadow_cluster... > ...finished clustering. > error: unknown command > "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap > help' > ERROR: Snap failed > --> rank=21, hostname=localhost > ERROR: Failed while preparing ab-inits > ERROR: Chunk failed at level:0, tier_type:2 > FAILED CONTIG:scf7180000007575_pilon_pilon > > I confirmed that the path to genome.hmm is correct. In addition, run.log > contains the following kind of output: > > STARTED > genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m > asked.0.genome%2Ehmm.snap > DIED RANK 30:4:0:0 > DIED COUNT 2 > DIED RANK 30 > DIED COUNT 2 > > How can I resolve this issue? > > Also, is the warning about it being impossible to use the old data to be > expected? 
> > Attached files: > - maker_otps1.ctl: first pass control file > - maker_opts2.ctl: second pass control file > - run.log: log file for an example contig > > Thanks in Advance, > > Paul Sheridan > > -- > CSO at Tupac Bio > Email: paul at tupac.bio > Homepage: www.paulsheridan.net > Mobile: +81 80 7889 0859 > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > -- CSO at Tupac Bio Email: paul at tupac.bio Homepage: www.paulsheridan.net Mobile: +81 80 7889 0859 -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Sun Apr 7 07:25:22 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Sun, 7 Apr 2019 16:25:22 +0300 Subject: [maker-devel] Curious pattern in AED distributions Message-ID: Hi MAKER users, Lately I've been performing annotations for multiple genomes from the same species. When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: [image: AED_hist.png] This pattern is a bit surprising to me, in two aspects: 1) Why is there a surge towards 0.5? 2) Why is there a sudden drop right after that surge? Has anyone else seen this, or is this a specific outcome of my data/configuration? Any ideas of what may cause such a distribution? While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. Would appreciate your comments. Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: AED_hist.png
Type: image/png
Size: 8232 bytes
Desc: not available
URL:

From myandell at genetics.utah.edu Sun Apr 7 09:11:36 2019
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 7 Apr 2019 15:11:36 +0000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>

Hi Lior,

Fun! The short answer is I don't know. Obviously, the good stuff is on the right side of 0.5.

That said, I can think of a couple of things to look into to explain the left side of the graph. Are you allowing single exon genes? Are you using RNA seq data, protein, or both? What about repeat masking? Are you doing it? Do you have your own library?

My first guess would be low complexity/repeat sequences generating more or less random blastx hits across the genome... Carson, what do you think?

And finally, what does the AED look like for the genes included in the final build?

Sorry for all the questions, Lior. That's your punishment for asking an interesting one.

--mark

From: maker-devel on behalf of Lior Glick
Date: Sunday, April 7, 2019 at 7:26 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Curious pattern in AED distributions

Hi MAKER users,
Lately I've been performing annotations for multiple genomes from the same species.
When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:
[AED_hist.png]
This pattern is a bit surprising to me, in two aspects:
1) Why is there a surge towards 0.5?
2) Why is there a sudden drop right after that surge?

Has anyone else seen this, or is this a specific outcome of my data/configuration?
Any ideas of what may cause such a distribution?

While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
Would appreciate your comments.
Thank you!
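A histogram like the one described in this thread can be rebuilt from MAKER's GFF3 output. The following is a minimal sketch and not part of the original exchange: it assumes the merged GFF3 (for example the genome.all.gff written by gff3_merge) carries an _AED= attribute on each mRNA line, and the function names aed_values and aed_histogram are made up for illustration.

```python
import re

# Match "_AED=" only at the start of an attribute, so "_eAED=" is skipped.
AED_RE = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def aed_values(gff_lines):
    """Yield the _AED value of every mRNA feature in a GFF3 line stream."""
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        match = AED_RE.search(cols[8])
        if match:
            yield float(match.group(1))


def aed_histogram(values, bin_pct=5):
    """Count AED values per bin; bin_pct is the bin width in hundredths."""
    n_bins = 100 // bin_pct
    counts = [0] * n_bins
    for value in values:
        # Work in integer hundredths to avoid floating-point bin edges;
        # an AED of exactly 1.00 is folded into the last bin.
        idx = min(round(value * 100) // bin_pct, n_bins - 1)
        counts[idx] += 1
    return counts
```

Calling aed_histogram(aed_values(open("genome.all.gff"))) would return 20 counts, one per 0.05-wide bin, which makes a pile-up just below 0.5 easy to spot before any plotting.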
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 8233 bytes
Desc: image001.png
URL:

From myandell at genetics.utah.edu Sun Apr 7 11:39:16 2019
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 7 Apr 2019 17:39:16 +0000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>
Message-ID: <116090CF-13B6-4E54-A5AA-8F7D7FCF2F23@umail.utah.edu>

Sorry. I'm dyslexic, especially early in the morning. Yes, the good stuff is on the left. As regards single exon genes, that's always a hard call, as these have a higher false positive rate. One thing to consider is how prevalent introns are in your organism. Carson can give more advice on this point, I'm sure.

By "final build", I meant: is this using the "Standard build" or "Max Build" protocol from PMC4286374?

From: Lior Glick
Date: Sunday, April 7, 2019 at 10:29 AM
To: Mark Yandell
Cc: "liorglic at mail.tau.ac.il", "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] Curious pattern in AED distributions

Dear Mark,
Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (;
Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation...
Now for the questions:
1) I did not make any filtrations so far, so single exon genes are included as well. in fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes?
2) My evidence consist of assembled transcripts, proteins and predicted gene models (pred_gff).
3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest.
Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"?
Hope these answers are helpful, and waiting to hear more thoughts.
Thanks again.

On Sun, Apr 7, 2019, 18:11 Mark Yandell wrote:

Hi Lior,

Fun! The short answer is I don't know. Obviously, the good stuff is on the right side of 0.5.

That said, I can think of a couple of things to look into to explain the left side of the graph. Are you allowing single exon genes? Are you using RNA seq data, protein, or both? What about repeat masking? Are you doing it? Do you have your own library?

My first guess, would be low complexity/repeat sequences generating more or less random blastx hits across the genome... Carson, what do you think?

And finally, what does the AED look like for the genes included in the final build?

Sorry for all the questions, Lior. That's your punishment for asking an interesting one.

--mark

From: maker-devel on behalf of Lior Glick
Date: Sunday, April 7, 2019 at 7:26 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Curious pattern in AED distributions

Hi MAKER users,
Lately I've been performing annotations for multiple genomes from the same species.
When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:
[AED_hist.png]
This pattern is a bit surprising to me, in two aspects:
1) Why is there a surge towards 0.5?
2) Why is there a sudden drop right after that surge?

Has anyone else seen this, or is this a specific outcome of my data/configuration?
Any ideas of what may cause such a distribution?

While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
Would appreciate your comments.
Thank you!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ychliu at genetics.ac.cn Tue Apr 2 19:21:33 2019
From: ychliu at genetics.ac.cn (ychliu at genetics.ac.cn)
Date: Wed, 3 Apr 2019 09:21:33 +0800
Subject: [maker-devel] MAKER problem with gff3 file
Message-ID: <2019040309213197334742@genetics.ac.cn>

Dear MAKER developers,
I recently used MAKER to do gene annotation. But even though I use a gff3 file as the EST evidence, the results show no genes marked by est2genome (I do use the parameter est2genome=1). This may mean that the gff3 input does not work. So what's the problem? How can I solve it? Eager for your assistance.
Faithfully yours.
Yucheng Liu

Yucheng Liu
Institute of Genetics and Developmental Biology, CAS
Beijing, 100101 China
Tel: 86-010-64801362
E-mail: ychliu at genetics.ac.cn
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From liorglck at gmail.com Sun Apr 7 10:29:13 2019
From: liorglck at gmail.com (Lior Glick)
Date: Sun, 7 Apr 2019 19:29:13 +0300
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>
References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>
Message-ID:

Dear Mark,
Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (;
Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation...
Now for the questions:
1) I did not make any filtrations so far, so single exon genes are included as well. In fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes?
2) My evidence consists of assembled transcripts, proteins and predicted gene models (pred_gff).
3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest.
Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"?
Hope these answers are helpful, and waiting to hear more thoughts.
Thanks again.

On Sun, Apr 7, 2019, 18:11 Mark Yandell wrote:

> Hi Lior,
>
> Fun! The short answer is I don't know. Obviously, the good stuff is on the
> right side of 0.5.
>
> That said, I can think of a couple of things to look into to explain the
> left side of the graph. Are you allowing single exon genes? Are you using
> RNA seq data, protein, or both? What about repeat masking? Are you doing
> it? Do you have your own library?
>
> My first guess, would be low complexity/repeat sequences generating more
> or less random blastx hits across the genome... Carson, what do you think?
>
> And finally, what does the AED look like for the genes included in the
> final build?
>
> Sorry for all the questions, Lior. That's your punishment for asking an
> interesting one.
>
> --mark
>
> *From: *maker-devel on behalf of
> Lior Glick
> *Date: *Sunday, April 7, 2019 at 7:26 AM
> *To: *"maker-devel at yandell-lab.org"
> *Subject: *[maker-devel] Curious pattern in AED distributions
>
> Hi MAKER users,
>
> Lately I've been performing annotations for multiple genomes from the same
> species.
>
> When plotting the histogram of AED scores over all genes, I repeatedly see
> a very specific pattern, that looks something like this:
>
> [image: AED_hist.png]
>
> This pattern is a bit surprising to me, in two aspects:
>
> 1) Why is there a surge towards 0.5?
>
> 2) Why is there a sudden drop right after that surge?
>
> Has anyone else seen this, or is this a specific outcome of my
> data/configuration?
>
> Any ideas of what may cause such a distribution?
>
> While this is not necessarily an indication of a problem or bug, it does
> seem a bit odd, and might imply some bias or artifact.
>
> Would appreciate your comments.
>
> Thank you!
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 8233 bytes
Desc: not available
URL:

From carsonhh at gmail.com Sun Apr 7 19:06:49 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:06:49 -0600
Subject: [maker-devel] MAKER problem with gff3 file
In-Reply-To: <2019040309213197334742@genetics.ac.cn>
References: <2019040309213197334742@genetics.ac.cn>
Message-ID: <961D15D1-36C0-4DD9-BE81-7C652A2C4CCF@gmail.com>

The est2genome=1 option in MAKER2 only works with input FASTA files because it's based on Exonerate's est2genome alignments. It does not work with GFF3 input (GFF3 is missing some things that are in the Exonerate report). MAKER3, however, will let you do this with GFF3 input (it goes back and tries to predict missing info that Exonerate would have produced).

--Carson

> On Apr 2, 2019, at 7:21 PM, ychliu at genetics.ac.cn wrote:
>
> Dear MAKER developers,
> I recently use the MAKER to do gene annotation. But even I use the gff3 file as the EST evidence, the result shows no gene that marked by est2genome (I do use the parameter est2genome=1). It may means that the gff3 seems doesn't work. So what's the problem? How can I solve it? Eager for you assistance.
> Faithfully yours.
> Yucheng Liu
>
> Yucheng Liu
> Institute of Genetics and Developmental Biology, CAS
> Beijing, 100101 China
> Tel: 86-010-64801362
> E-mail: ychliu at genetics.ac.cn
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Sun Apr 7 19:08:54 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:08:54 -0600
Subject: [maker-devel] Installation Failure with pg_config?
In-Reply-To:
References:
Message-ID: <89174279-57D0-46D3-BE9D-FA03ED861227@gmail.com>

DBD::Pg is optional. You should be able to say "No" to the question on whether you want to install optional modules during the Build step.

--Carson

> On Mar 14, 2019, at 4:24 PM, Shaowen Jiang wrote:
>
> Dear MAKER2 admins:
>
> Hi, I have read some tutorials for annotating a newly assembly genome and MAKER2 seems to be a very good and functional pipeline to me. So I am trying to use it to annotate a new assembly mammalian genome that our lab just generated.
> But I was stuck while I was trying to install MAKER2 to our slurm HPC server.
> I think the pipeline is trying to install several perl packages locally, but one of them called DBD::Pg requires the path of pg_config?
> screenshot as below
>
> But I think our server doesn't have this path and I don't have root to install some other stuff, like libpq-dev or PostgreSQL.
> Is that any other methods that can circle around that?
>
> Any help or advice would be appreciated!
>
> Thanks,
> Shaowen
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Sun Apr 7 19:32:39 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:32:39 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID:

That's interesting. It could be a handful of internal filters that help with spurious results.

I use a 0.5 sensitivity/specificity to identify shared edges for a Jaccardian split on overlapping evidence clusters, for example. There are also a couple of places where, if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or EST splice support, but just a chunk of single exon blastx), MAKER will use a reading frame aware AED value of 0.5 as a filter (as in it checks if the reading frame matches and not just raw overlap). If that's the case, the spike near 0.5 may indicate I needed to be a little stricter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff for these spurious blastx induced models.

--Carson

> On Apr 7, 2019, at 7:25 AM, Lior Glick wrote:
>
> Hi MAKER users,
> Lately I've been performing annotations for multiple genomes from the same species.
> When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:
>
> This pattern is a bit surprising to me, in two aspects:
> 1) Why is there a surge towards 0.5?
> 2) Why is there a sudden drop right after that surge?
>
> Has anyone else seen this, or is this a specific outcome of my data/configuration?
> Any ideas of what may cause such a distribution?
>
> While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
> Would appreciate your comments.
> Thank you!
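For reference, AED is usually defined as 1 - (SN + SP) / 2, where SN is the fraction of evidence bases covered by the gene model and SP is the fraction of model bases covered by evidence. The sketch below only illustrates that definition on plain coordinate intervals. It is not MAKER's code, it ignores the reading-frame check mentioned in the reply above, and the function names covered_bases and aed are hypothetical.

```python
def covered_bases(intervals):
    """Expand 1-based inclusive (start, end) intervals into a set of positions."""
    bases = set()
    for start, end in intervals:
        bases.update(range(start, end + 1))
    return bases


def aed(model_exons, evidence_intervals):
    """Annotation edit distance between a gene model and its evidence.

    Returns 0.0 for perfect agreement and 1.0 when nothing overlaps
    (or when either side is empty).
    """
    model = covered_bases(model_exons)
    evidence = covered_bases(evidence_intervals)
    if not model or not evidence:
        return 1.0
    shared = len(model & evidence)
    sensitivity = shared / len(evidence)  # evidence bases explained by the model
    specificity = shared / len(model)     # model bases backed by evidence
    return 1.0 - (sensitivity + specificity) / 2.0
```

Under this definition a model that shares half of its bases with the evidence, and covers half of the evidence, lands exactly on 0.5, which is the value used as a filter for models supported only by a single exon blastx hit.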
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From xvazquezc at gmail.com Sun Apr 7 22:42:15 2019
From: xvazquezc at gmail.com (Xabier Vázquez-Campos)
Date: Mon, 8 Apr 2019 14:42:15 +1000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID:

If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence, e.g. by default it will require having some EST evidence.

On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote:

> That's interesting. It could be a handful of internal filters that help
> with spurious results.
>
> I use a 0.5 sensitivity/specificity to identify shared edges for a
> jaccardian split on overlapping evidence clusters for example. There are
> also a couple of places where if the only thing supporting a model is a
> single exon blastx hit (i.e. no exonerate, ab initio model, or est splice
> support, but just a chunk of single exon blastx) then maker will use a
> reading frame aware AED value of 0.5 as a filter (as in it checks if the
> reading frame matches and not just raw overlap). If that's the case, the
> spike near 0.5 may indicate I needed to be a little stricter than my
> empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff
> for these spurious blastx induced models.
>
> --Carson
>
>
> > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote:
> >
> > Hi MAKER users,
> > Lately I've been performing annotations for multiple genomes from the
> same species.
> > When plotting the histogram of AED scores over all genes, I repeatedly
> see a very specific pattern, that looks something like this:
> >
> > This pattern is a bit surprising to me, in two aspects:
> > 1) Why is there a surge towards 0.5?
> > 2) Why is there a sudden drop right after that surge?
> >
> > Has anyone else seen this, or is this a specific outcome of my
> data/configuration?
> > Any ideas of what may cause such a distribution?
> >
> > While this is not necessarily an indication of a problem or bug, it does
> seem a bit odd, and might imply some bias or artifact.
> > Would appreciate your comments.
> > Thank you!
> > _______________________________________________
> > maker-devel mailing list
> > maker-devel at box290.bluehost.com
> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>

--
Xabier Vázquez-Campos, *PhD*
*Research Associate*
NSW Systems Biology Initiative
School of Biotechnology and Biomolecular Sciences
The University of New South Wales
Sydney NSW 2052 AUSTRALIA
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Sun Apr 7 23:20:24 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 23:20:24 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>

Yes. maker2zff tries to further select a subset of the best supported models by requiring multiple forms of evidence support.

--Carson

> On Apr 7, 2019, at 10:42 PM, Xabier Vázquez-Campos wrote:
>
> If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. e.g. by default it will require having some EST evidence
>
> On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote:
> That's interesting. It could be a handful of internal filters that help with spurious results.
>
> I use a 0.5 sensitivity/specificity to identify shared edges for a jaccardian split on overlapping evidence clusters for example.
There are also a couple of places where if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or est splice support, but just a chunk od single exon blastx) then maker will use a reading frame aware AED value of 0.5 as a filter (as in it checks if the reading frame matches and not just raw overlap). If that?s the case, the spike near 0.5 may indicate I needed to be a little strickter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cuttoff for these spurious blastx induced models. > > ?Carson > > > > On Apr 7, 2019, at 7:25 AM, Lior Glick > wrote: > > > > Hi MAKER users, > > Lately I've been performing annotations for multiple genomes from the same species. > > When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: > > > > This pattern is a bit surprising to me, in two aspects: > > 1) Why is there a surge towards 0.5? > > 2) Why is there a sudden drop right after that surge? > > > > Has anyone else seen this, or is this a specific outcome of my data/configuration? > > Any ideas of what may cause such a distribution? > > > > While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. > > Would appreciate your comments. > > Thank you! 
> > _______________________________________________ > > maker-devel mailing list > > maker-devel at box290.bluehost.com > > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > -- > Xabier V?zquez-Campos, PhD > Research Associate > NSW Systems Biology Initiative > School of Biotechnology and Biomolecular Sciences > The University of New South Wales > Sydney NSW 2052 AUSTRALIA -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Mon Apr 8 00:54:06 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Mon, 8 Apr 2019 09:54:06 +0300 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: Hello again and thank you all for your interesting answers. I mistakenly answered Mark yesterday from an unsubscribed mail, which resulted in only him getting it, so for documentation sake, I'm posting my answer here again, and Mark's reply: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Dear Mark, Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (; Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation... Now for the questions: 1) I did not make any filtrations so far, so single exon genes are included as well. in fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes? 
2) My evidence consists of assembled transcripts, proteins and predicted gene models (pred_gff).
3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest.
Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"?
Hope these answers are helpful, and waiting to hear more thoughts.
Thanks again.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
To which Mark replied:
Sorry. I'm dyslexic, especially early in the morning. Yes, good stuff is on the left. As regards single exon genes, that's always a hard call, as these have a higher false positive rate. Things to consider are how prevalent are introns in your org? Carson can give more advice on this point, I'm sure.
By "final build", I meant is this using the "Standard build" or "Max Build" protocol from PMC4286374?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Mark - well, as I said I haven't done any filtering yet, so I guess my annotation currently includes genes that would be discarded even with the "max build". I'll give this a try and look at the resulting distribution.
Xabier - thanks, but I'm not using SNAP (just Augustus).
Carson - I see a few fingers pointing in the direction of single-exon models, so maybe I should see what happens to the distribution of AED when these genes are removed.
I'll get back to you with some more results.

On Mon, 8 Apr 2019 at 8:20, Carson Holt wrote:

> Yes. maker2zff tries to further select a subset of the best supported
> models by requiring multiple forms of evidence support.
>
> --Carson

From liorglic at mail.tau.ac.il Mon Apr 8 03:10:15 2019
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Mon, 8 Apr 2019 12:10:15 +0300
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
Message-ID:

Hi again - quick update:
I made a plot comparing the histograms of single-exon genes to multi-exon genes:
[image: newplot (5).png]
It definitely looks like single-exon genes are *enriched* for the 0.5 score, but they do not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software.
Any other thoughts/suggestions?

Thanks again,

-------------- next part --------------
A non-text attachment was scrubbed...
Name: newplot (5).png
Type: image/png
Size: 18037 bytes
Desc: not available
URL:

From carsonhh at gmail.com Mon Apr 8 10:48:42 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 8 Apr 2019 10:48:42 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
Message-ID:

One note.
When I say single exon blastx hit, I mean that the evidence is single exon, not that the gene model is single exon. What I think you are seeing is an effect that seems to be partially related to under-masking, i.e. a spurious partial blastx alignment to a low complexity repeat (which is why the blastx protein alignment refuses to polish with exonerate). That is why the filter was added. So if a model (single or multi-exon) has no additional ab initio prediction support, has no EST support, and has no exonerate polished protein support, but does have a single-exon/single-hsp blastx overlap, it gets filtered out at 0.5 (that threshold is based on trial and error on a couple of genomes where we saw this occur, but your graph suggests that the filter might be too loose and 0.4 or 0.45 might be a better value).

So the spike is caused by poor blastx and under-masking (this may be explained if you are using in pred_gff models that were generated on an unmasked assembly outside of MAKER), then the drop around 0.5 is caused by MAKER filtering out models only supported by what appear to be spurious blastx alignments.

--Carson

> On Apr 8, 2019, at 3:10 AM, Lior Glick wrote:
>
> Hi again - quick update:
> I made a plot comparing the histograms of single-exon genes to multi-exon genes:
>
> It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software.
> Any other thoughts/suggestions?
>
> Thanks again,
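
The single-exon versus multi-exon comparison discussed in this thread can be reproduced from a MAKER GFF3 file. The following is a minimal Python sketch and not part of MAKER. It assumes tab-separated GFF3 in which each mRNA line carries an _AED attribute and each exon line names its transcript(s) in Parent=, as in the records posted to this list. The function names and the 5% bin width are illustrative choices only.

```python
from collections import Counter, defaultdict


def parse_attributes(column9):
    """Split a GFF3 attribute column such as 'ID=x;Parent=y' into a dict."""
    pairs = (field.split("=", 1) for field in column9.strip().split(";") if "=" in field)
    return {key: value for key, value in pairs}


def aed_histograms(gff_lines, bin_pct=5):
    """Count mRNA AED scores per bin, grouped by the transcript's exon count.

    Returns {"1 exon" | "2 exons" | "3+ exons": Counter({bin_index: n})},
    where bin_index * bin_pct / 100 is the lower edge of the bin.
    """
    aed = {}                        # mRNA ID -> AED score
    exon_count = defaultdict(int)   # mRNA ID -> number of exon features
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue                # skips FASTA or malformed lines
        attrs = parse_attributes(cols[8])
        if cols[2] == "mRNA" and "_AED" in attrs and "ID" in attrs:
            aed[attrs["ID"]] = float(attrs["_AED"])
        elif cols[2] == "exon":
            # A GFF3 exon may list several comma-separated parents.
            for parent in attrs.get("Parent", "").split(","):
                if parent:
                    exon_count[parent] += 1

    last_bin = 100 // bin_pct - 1
    hist = defaultdict(Counter)
    for mrna_id, score in aed.items():
        n = exon_count.get(mrna_id, 0)
        if n == 0:
            continue                # no exon features seen for this transcript
        label = "1 exon" if n == 1 else "2 exons" if n == 2 else "3+ exons"
        # Work in whole percent to avoid floating-point effects at bin borders.
        bin_index = min(round(score * 100) // bin_pct, last_bin)
        hist[label][bin_index] += 1
    return dict(hist)
```

Because the function only iterates over lines, an open file handle can be passed directly. Keeping two-exon transcripts in their own class makes it possible to check whether they pile up near 0.5 in the same way as the single-exon models.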
From carsonhh at gmail.com Mon Apr 8 10:51:55 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 8 Apr 2019 10:51:55 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
Message-ID: <75B7E2C9-2D1B-452F-BEED-704289C881ED@gmail.com>

Try also adding 2 exon models to the graph. It would be interesting to see if these are attempted single-exon models where the predictor added a micro-intron to keep the open reading frame going against a single exon blastx hint.

--Carson

> On Apr 8, 2019, at 3:10 AM, Lior Glick wrote:
>
> Hi again - quick update:
> I made a plot comparing the histograms of single-exon genes to multi-exon genes:
>
> It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software.
> Any other thoughts/suggestions?
>
> Thanks again,

From ying.hu at ufl.edu Wed Apr 17 09:20:00 2019
From: ying.hu at ufl.edu (Hu,Ying)
Date: Wed, 17 Apr 2019 15:20:00 +0000
Subject: [maker-devel] maker exons number
Message-ID:

Hi, Carson,

I am using MAKER 2.31.6 to annotate a genome. I noticed that the exon number in each gene does not start from 1. Can you give me some suggestions on how to change the exon numbers to 1, 2, 3, ... in each gene? Thanks,

Ying

Here are some examples:

tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow
tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0
tig00000226|arrow maker mRNA 26339 27915 . + .
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82 tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27490 27636 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27738 27808 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 tig00000226|arrow maker mRNA 5803 6589 . + . 
ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 5803 5975 . + 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 6505 6589 . + 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker five_prime_UTR 40927 41192 . + . 
ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker gene 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 tig00034405|arrow maker mRNA 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 59245 59725 . 
+ 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 tig00034405|arrow maker exon 90355 90911 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91011 91086 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93711 93786 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94536 94573 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93866 93972 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91179 91240 . 
- 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1
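
GFF3 only requires ID values to be unique within a file, and the trailing number in IDs such as ...:exon:58 appears to be a running counter rather than a position within the gene. If a per-transcript exon rank (1, 2, 3, ...) is wanted, it can be derived from coordinates and strand instead. The following is a minimal Python sketch and not part of MAKER. It assumes tab-separated GFF3 in which exon features name their transcript(s) in Parent=; the function name exon_ranks is hypothetical.

```python
from collections import defaultdict


def exon_ranks(gff_lines):
    """Return {transcript ID: [(rank, start, end), ...]} in transcription order.

    Rank 1 is the 5'-most exon: lowest start on '+', highest start on '-'.
    """
    by_parent = defaultdict(list)
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(f.split("=", 1) for f in cols[8].strip().split(";") if "=" in f)
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        # One exon line may belong to several transcripts (comma-separated
        # Parent values), so it can receive a different rank in each of them.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                by_parent[parent].append((start, end, strand))

    ranks = {}
    for parent, exons in by_parent.items():
        on_minus = exons[0][2] == "-"
        ordered = sorted(exons, key=lambda exon: exon[0], reverse=on_minus)
        ranks[parent] = [(i, s, e) for i, (s, e, _) in enumerate(ordered, start=1)]
    return ranks
```

The ranks are kept in a separate mapping rather than written back into ID=, because a shared exon line cannot carry one number that is correct for every parent transcript.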
From carsonhh at gmail.com Wed Apr 17 13:43:41 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 17 Apr 2019 13:43:41 -0600
Subject: [maker-devel] maker exon number
In-Reply-To:
References:
Message-ID:

The ID= value is simply a unique value used to resolve inheritance in conjunction with Parent=. It has no biological meaning. Also, in GFF3 format, to reduce redundancy, a single 'exon' feature can be the child of multiple mRNA features, so a single 'exon' line can be the first exon in one transcript but the second exon in another.

--Carson

> On Apr 17, 2019, at 9:23 AM, YING HU wrote:
>
> Hi, Carson,
>
> I am using MAKER 2.31.6 to annotate a genome. I noticed that the exon number in each gene does not start from 1. Can you give me some suggestions on how to change the exon numbers to 1, 2, 3, ... in each gene? Thanks,
>
> Ying
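Because the numeric suffix in ID= carries no ordering, exon ranks have to be derived from coordinates and strand instead. The following is a minimal Python sketch of that idea and is not part of MAKER: the function name number_exons is hypothetical, and it assumes a tab-separated GFF3 file in which every exon row names its transcript(s) in Parent=. An exon shared by several transcripts gets its own rank in each of them, which matches the point about shared exon lines.

```python
from collections import defaultdict


def number_exons(gff_lines):
    """Return {transcript_id: [(rank, start, end), ...]} in transcript order.

    Hypothetical helper: ranks come from coordinates and strand,
    never from the arbitrary numeric suffix in the ID= attribute.
    """
    exons = defaultdict(list)  # transcript ID -> [(start, end), ...]
    strand_of = {}             # transcript ID -> "+" or "-"
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # Parent= may hold a comma-separated list when an exon is shared.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((int(cols[3]), int(cols[4])))
                strand_of[parent] = cols[6]
    ranked = {}
    for tid, spans in exons.items():
        # On the minus strand the first exon has the highest coordinates.
        ordered = sorted(spans, reverse=(strand_of[tid] == "-"))
        ranked[tid] = [(i, s, e) for i, (s, e) in enumerate(ordered, start=1)]
    return ranked
```

Passing an open file handle for the merged GFF3 (for example genome.all.gff) to number_exons would give per-transcript ranks that can be written to a separate attribute or table, leaving the ID= values untouched so that Parent= relationships stay valid.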
From paul at tupac.bio Thu Apr 18 03:23:35 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Thu, 18 Apr 2019 18:23:35 +0900
Subject: [maker-devel] maker_functional_gff Error
Message-ID:

Dear MAKER Team,

I am running MAKER 2.31.10 on a 32-core instance. I followed the Post Processing of Annotations steps as described in the MAKER Tutorial for GMOD Online Training 2014 as best I could, but I get an error when I run maker_functional_gff. The commands in the order of execution and relevant output are shown below.

What did I do wrong?

# run blastp command
blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 -out output.blastp

# run interproscan command
interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p -i genome.all.maker.proteins.fasta -o output.iprscan

# create naming table
maker_map_ids --prefix ThuMac --justify 5 genome.all.gff > genome.all.map

# copy files for safe keeping
cp genome.all.gff genome.all.renamed.gff
cp genome.all.noseq.gff genome.all.noseq.renamed.gff
cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta
cp genome.all.maker.proteins.aed.0.50.fasta genome.all.maker.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.unique.proteins.aed.0.50.fasta genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.transcripts.fasta genome.all.maker.transcripts.renamed.fasta
cp genome.all.maker.transcripts.aed.0.50.fasta genome.all.maker.transcripts.aed.0.50.renamed.fasta
cp output.iprscan output.renamed.iprscan
cp output.blastp output.renamed.blastp

# replace uninformative MAKER protein/transcript names with useful ones
map_gff_ids genome.all.map genome.all.renamed.gff
map_gff_ids genome.all.map genome.all.noseq.renamed.gff
map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.proteins.aed.0.50.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.transcripts.aed.0.50.renamed.fasta
map_data_ids genome.all.map output.renamed.iprscan
map_data_ids genome.all.map output.renamed.blastp

# assign annotations
maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff

> head output.renamed.blastp
ThuMac30929-RA P20036 41.791 134 77 1 326 458 113 246 9.51e-28 114
ThuMac19623-RA P81018 35.714 168 87 2 1 147 1 168 8.40e-33 117
ThuMac19629-RA Q66I51 68.939 264 79 2 1 263 1 262 1.48e-130 372
ThuMac19628-RA Q61464 55.172 87 37 1 766 852 382 466 4.42e-25 119
ThuMac19627-RA P07898 48.276 58 29 1 13 69 1962 2019 3.60e-13 65.9
ThuMac19626-RA P81018 36.782 174 96 2 21 180 1 174 5.75e-36 127
ThuMac19624-RA P81018 35.057 174 99 2 21 180 1 174 2.19e-33 120
ThuMac19625-RA Q28343 32.520 123 43 2 35 117 2123 2245 7.57e-17 78.6
ThuMac19636-RA Q9QX29 90.909 110 10 0 5 114 458 567 6.45e-65 216
ThuMac19638-RA Q9QX29 57.391 115 35 3 5 114 703 808 3.06e-28 120

> head output.renamed.iprscan
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00520 Ion transport protein 154 413 3.8E-21 T 18-04-2019 IPR005821 Ion transport domain GO:0005216|GO:0006811|GO:0016020|GO:0055085
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF08412 Ion transport protein N-terminal 109 152 5.1E-19 T 18-04-2019 IPR013621 Ion transport N-terminal Reactome: R-HSA-1296061
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00027 Cyclic nucleotide-binding domain 519 601 1.0E-17 T 18-04-2019 IPR000595 Cyclic nucleotide-binding domain
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF13765 SPRY-associated domain 235 283 8.9E-23 T 18-04-2019 IPR006574 SPRY-associated
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00643 B-box zinc finger 18 56 5.2E-12 T 18-04-2019 IPR000315 B-box-type zinc finger GO:0008270
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00622 SPRY domain 287 391 2.2E-14 T 18-04-2019 IPR003877 SPRY domain GO:0005515
ThuMac08369-RA 7aee1da5a47975ab8e43b68bfd1a117c 139 Pfam PF00076 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain) 22 87 1.6E-15 T 18-04-2019 IPR000504 RNA recognition motif domain GO:0003676
ThuMac26054-RA 8f4119609312bd6442f8bb094c104231 462 Pfam PF07565 Band 3 cytoplasmic domain 173 443 7.3E-100 T 18-04-2019 IPR013769 Band 3 cytoplasmic domain GO:0006820|GO:0008509|GO:0016021 Reactome: R-HSA-425381
ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF03372 Endonuclease/Exonuclease/phosphatase family 235 535 7.0E-11 T 18-04-2019 IPR005135 Endonuclease/exonuclease/phosphatase
ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF17751 SKICH domain 555 649 9.8E-23 T 18-04-2019 IPR041611 SKICH domain

> map_data_ids genome.all.map output.renamed.iprscan
WARNING: No mapping available for ThuMac01937-RA
WARNING: No mapping available for ThuMac02226-RA
WARNING: No mapping available for ThuMac20730-RA
WARNING: No mapping available for ThuMac20730-RA
WARNING: No mapping available for ThuMac14750-RA
(Thousands of warnings like these were returned)

> maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff
Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.

> head genome.all.renamed.putative_function.gff
##gff-version 3
scf7180000008677_pilon_pilon . contig 1 49996 . . . ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon

Thanks in Advance,

Paul Sheridan

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859
From carsonhh at gmail.com Mon Apr 22 11:50:27 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Apr 2019 11:50:27 -0600
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To:
References:
Message-ID: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>

This "WARNING: No mapping available for ThuMac01937-RA" means you are running on a file that has already been renamed. An unrenamed file will have names like maker-SDFGDG-gene-0.1-mRNA-1, for example, but the script is finding the name ThuMac01937-RA, which is not in the first column of the map file, so it throws a warning.

The second one:

> Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.

You likely have a truncated line in the GFF3. It's missing an ID= tag. This can sometimes happen when writing to network-mounted (NFS) file systems because of an asynchronous IO error. NFS file systems have a performance enhancement where they return SUCCESS on IO operations early and then complete the IO operation later in the background. This improves speed by letting the program advance without blocking on the IO operation, but it reduces reliability: if the later operation is not really successful, the file system can't go back and tell the program "never mind, it failed." The result is a silent truncation of data. It is not super common, but not all that rare either, depending on IO load (e.g. heavy MPI with lots of writes). Find the line that's truncated, then rerun just that contig before building the merged GFF3 for everything.

--Carson
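Both diagnoses in the reply above can be checked mechanically before rerunning anything. The following Python sketch is only an illustration and is not part of MAKER: the function names are hypothetical, the GFF3 check is a heuristic (it flags feature rows without nine tab-separated columns or without an ID= attribute), and the map-file check assumes two whitespace-separated columns with the original MAKER name first and the new name second.

```python
def find_suspect_gff_lines(gff_lines):
    """Yield (line_number, reason) for feature rows that look truncated.

    Heuristic only: a complete GFF3 feature row has 9 tab-separated
    columns, and a truncated row is expected to lack an ID= attribute.
    """
    for n, line in enumerate(gff_lines, start=1):
        text = line.rstrip("\n")
        if text.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not text or text.startswith("#"):
            continue
        cols = text.split("\t")
        if len(cols) != 9:
            yield n, f"expected 9 tab-separated columns, found {len(cols)}"
        elif not any(kv.startswith("ID=") for kv in cols[8].split(";")):
            yield n, "no ID= attribute in column 9"


def count_id_matches(map_lines, data_lines):
    """Count first-column IDs of a data file by where they occur in the map.

    Assumes the map has the original name in column 1 and the new name in
    column 2. Many "new" hits mean the data file was already renamed.
    """
    old_ids, new_ids = set(), set()
    for line in map_lines:
        fields = line.split()
        if len(fields) >= 2:
            old_ids.add(fields[0])
            new_ids.add(fields[1])
    counts = {"old": 0, "new": 0, "unknown": 0}
    for line in data_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] in old_ids:
            counts["old"] += 1
        elif fields[0] in new_ids:
            counts["new"] += 1
        else:
            counts["unknown"] += 1
    return counts
```

Run against the merged GFF3, find_suspect_gff_lines would point at the row to inspect, and the contig named in its first column is the one to rerun. A count_id_matches result dominated by "new" for output.renamed.iprscan would be consistent with map_data_ids having been applied to that file a second time.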
From paul at tupac.bio Sun Apr 28 19:40:12 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Mon, 29 Apr 2019 10:40:12 +0900
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
References: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
Message-ID:

Hi Carson,

Thanks, your suggestions got me sorted out.

Best,
Paul

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859

From bastian.schiffthaler at umu.se Wed Apr 24 01:43:15 2019
From: bastian.schiffthaler at umu.se (Bastian Schiffthaler)
Date: Wed, 24 Apr 2019 07:43:15 -0000
Subject: [maker-devel] Redundant FASTA headers
Message-ID: <251d38f5-c15a-6070-fcd9-d6144744885e@umu.se>

Hi,

I'm running the MPI version of MAKER and I'm supplying seven different Trinity assemblies (different experiments) as evidence. Trinity does not generate unique FASTA headers >across< files, so I'm wondering whether there could be an issue with ID collisions. What does MAKER use the headers for? Could duplicate headers create race conditions in temp files?

Thanks in advance,
Bastian

From Christian_jpg2 at hotmail.com Tue Apr 30 12:42:31 2019
From: Christian_jpg2 at hotmail.com (Christian Ayala)
Date: Tue, 30 Apr 2019 18:42:31 +0000
Subject: [maker-devel] Running out of time in MAKER
Message-ID:

Good afternoon,

I am trying to annotate some insect genomes using MAKER. MAKER is running on a system that uses a PBS scheduler and has a walltime limit of 120 hours. As a result, my jobs run out of time and are killed before MAKER finishes the annotation. Is there a way to resume a killed MAKER run?

Thanks for your help.

Best regards,
Christian Ayala-Ortiz
URL: From paul at tupac.bio Thu Apr 4 16:46:12 2019 From: paul at tupac.bio (Paul Sheridan) Date: Fri, 5 Apr 2019 07:46:12 +0900 Subject: [maker-devel] Running SNAP with MAKER Message-ID: Dear MAKER Team, I am running MAKER 2.31.10 a 32 core instance. My first pass completed successfully. However, my second pass using SNAP and Augustus trained ab initio gene predictions failed. Here is some example output which illustrates the problem: MAKER WARNING: Changes in control files make re-use of all old data impossible All old files will be erased before continuing processing all repeats doing repeat masking doing repeat masking #--------------------------------------------------------------------- Now starting the contig!! SeqID: scf7180000008677_pilon_pilon Length: 49996 #--------------------------------------------------------------------- doing repeat masking preparing ab-inits running snap. #--------- command -------------# Widget::snap: /usr/bin/snap /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm /tmp/maker_8RuX8Z/scf718 0000006915_pilon_pilon.abinit_masked.0 > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske d.0.genome%2Ehmm.snap #-------------------------------# setting up GFF3 output and fasta chunks processing all repeats doing repeat masking in cluster::shadow_cluster... ...finished clustering. error: unknown command "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap help' ERROR: Snap failed --> rank=21, hostname=localhost ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2 FAILED CONTIG:scf7180000007575_pilon_pilon I confirmed that the path to genome.hmm is correct. 
In addition, run.log contains the following kind of output: STARTED genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m asked.0.genome%2Ehmm.snap DIED RANK 30:4:0:0 DIED COUNT 2 DIED RANK 30 DIED COUNT 2 How can I resolve this issue? Also, is the warning about it being impossible to use the old data to be expected? Attached files: - maker_otps1.ctl: first pass control file - maker_opts2.ctl: second pass control file - run.log: log file for an example contig Thanks in Advance, Paul Sheridan -- CSO at Tupac Bio Email: paul at tupac.bio Homepage: www.paulsheridan.net Mobile: +81 80 7889 0859 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts2.ctl Type: application/octet-stream Size: 4728 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: run.log Type: application/octet-stream Size: 2366 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts1.ctl Type: application/octet-stream Size: 4515 bytes Desc: not available URL: From carsonhh at gmail.com Sat Apr 6 15:00:14 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 6 Apr 2019 15:00:14 -0600 Subject: [maker-devel] Running SNAP with MAKER In-Reply-To: References: Message-ID: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com> The error is being thrown by snap itself. Perhaps there is an issue with the genome.hmm file. Did you generate the file immediately previously to this run? Perhaps you can redo that process, and review any errors that come up during training. 
Some details on training SNAP from the wiki -->
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors

--Carson

> On Apr 4, 2019, at 4:46 PM, Paul Sheridan wrote:
>
> Dear MAKER Team,
>
> I am running MAKER 2.31.10 a 32 core instance. My first pass completed successfully. However, my second pass using SNAP and Augustus trained ab initio gene predictions failed. Here is some example output which illustrates the problem:
>
> MAKER WARNING: Changes in control files make re-use of all old data impossible
> All old files will be erased before continuing
> processing all repeats
> doing repeat masking
> doing repeat masking
> #---------------------------------------------------------------------
> Now starting the contig!!
> SeqID: scf7180000008677_pilon_pilon
> Length: 49996
> #---------------------------------------------------------------------
>
> doing repeat masking
> preparing ab-inits
> running snap.
> #--------- command -------------#
> Widget::snap:
> /usr/bin/snap /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm /tmp/maker_8RuX8Z/scf718
> 0000006915_pilon_pilon.abinit_masked.0 > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske
> d.0.genome%2Ehmm.snap
> #-------------------------------#
> setting up GFF3 output and fasta chunks
> processing all repeats
> doing repeat masking
> in cluster::shadow_cluster...
> ...finished clustering.
> error: unknown command "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap help'
> ERROR: Snap failed
> --> rank=21, hostname=localhost
> ERROR: Failed while preparing ab-inits
> ERROR: Chunk failed at level:0, tier_type:2
> FAILED CONTIG:scf7180000007575_pilon_pilon
>
> I confirmed that the path to genome.hmm is correct.
In addition, run.log contains the following kind of output: > > STARTED genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m > asked.0.genome%2Ehmm.snap > DIED RANK 30:4:0:0 > DIED COUNT 2 > DIED RANK 30 > DIED COUNT 2 > > How can I resolve this issue? > > Also, is the warning about it being impossible to use the old data to be expected? > > Attached files: > - maker_otps1.ctl: first pass control file > - maker_opts2.ctl: second pass control file > - run.log: log file for an example contig > > Thanks in Advance, > > Paul Sheridan > > -- > CSO at Tupac Bio > Email: paul at tupac.bio > Homepage: www.paulsheridan.net > Mobile: +81 80 7889 0859 > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul at tupac.bio Sun Apr 7 03:27:53 2019 From: paul at tupac.bio (Paul Sheridan) Date: Sun, 7 Apr 2019 18:27:53 +0900 Subject: [maker-devel] Running SNAP with MAKER In-Reply-To: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com> References: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com> Message-ID: Hi Carson, Indeed, I did generate the hmm file immediately previously to my second run. I redid the process by following these commands from the link you supplied: mkdir snap cd snap gff3_merge -d /root/tuna-round-2/genome.maker.output/genome_master_datastore_index.log maker2zff genome.all.gff fathom -categorize 1000 genome.ann genome.dna fathom -export 1000 -plus uni.ann uni.dna forge export.ann export.dna hmm-assembler.pl genome . > ../genome1.hmm I didn't find any errors generated by Snap during training. 
But when I reran MAKER, I got errors of this variety: processing all repeats processing all repeats error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help' error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help' preparing masked sequence processing all repeats collecting blastx repeatmasking preparing masked sequence collecting blastx repeatmasking collecting blastx repeatmasking processing all repeats processing all repeats processing all repeats processing all repeats preparing masked sequence collecting blastx repeatmasking ERROR: Snap failed --> rank=21, hostname=localhost ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2 FAILED CONTIG:scf7180000008536_pilon_pilon ERROR: Snap failed --> rank=5, hostname=localhost ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2 FAILED CONTIG:scf7180000008522_pilon_pilon ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scf7180000008536_pilon_pilon preparing masked sequence ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scf7180000008522_pilon_pilon Do you have any other suggestions? Thanks in Advance, Paul On Sun, Apr 7, 2019 at 6:00 AM Carson Holt wrote: > The error is being thrown by snap itself. Perhaps there is an issue with > the genome.hmm file. Did you generate the file immediately previously to > this run? Perhaps you can redo that process, and review any errors that > come up during training. > > Some details on training SNAP from the wiki ?> > http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors > > ?Carson > > > On Apr 4, 2019, at 4:46 PM, Paul Sheridan wrote: > > Dear MAKER Team, > > I am running MAKER 2.31.10 a 32 core instance. My first pass completed > successfully. 
However, my second pass using SNAP and Augustus trained ab > initio gene predictions failed. Here is some example output which > illustrates the problem: > > MAKER WARNING: Changes in control files make re-use of all old data > impossible > All old files will be erased before continuing > processing all repeats > doing repeat masking > doing repeat masking > #--------------------------------------------------------------------- > Now starting the contig!! > SeqID: scf7180000008677_pilon_pilon > Length: 49996 > #--------------------------------------------------------------------- > > doing repeat masking > preparing ab-inits > running snap. > #--------- command -------------# > Widget::snap: > /usr/bin/snap > /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm > /tmp/maker_8RuX8Z/scf718 > 0000006915_pilon_pilon.abinit_masked.0 > > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske > d.0.genome%2Ehmm.snap > #-------------------------------# > setting up GFF3 output and fasta chunks > processing all repeats > doing repeat masking > in cluster::shadow_cluster... > ...finished clustering. > error: unknown command > "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap > help' > ERROR: Snap failed > --> rank=21, hostname=localhost > ERROR: Failed while preparing ab-inits > ERROR: Chunk failed at level:0, tier_type:2 > FAILED CONTIG:scf7180000007575_pilon_pilon > > I confirmed that the path to genome.hmm is correct. In addition, run.log > contains the following kind of output: > > STARTED > genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m > asked.0.genome%2Ehmm.snap > DIED RANK 30:4:0:0 > DIED COUNT 2 > DIED RANK 30 > DIED COUNT 2 > > How can I resolve this issue? > > Also, is the warning about it being impossible to use the old data to be > expected? 
> > Attached files: > - maker_otps1.ctl: first pass control file > - maker_opts2.ctl: second pass control file > - run.log: log file for an example contig > > Thanks in Advance, > > Paul Sheridan > > -- > CSO at Tupac Bio > Email: paul at tupac.bio > Homepage: www.paulsheridan.net > Mobile: +81 80 7889 0859 > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > -- CSO at Tupac Bio Email: paul at tupac.bio Homepage: www.paulsheridan.net Mobile: +81 80 7889 0859 -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Sun Apr 7 07:25:22 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Sun, 7 Apr 2019 16:25:22 +0300 Subject: [maker-devel] Curious pattern in AED distributions Message-ID: Hi MAKER users, Lately I've been performing annotations for multiple genomes from the same species. When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: [image: AED_hist.png] This pattern is a bit surprising to me, in two aspects: 1) Why is there a surge towards 0.5? 2) Why is there a sudden drop right after that surge? Has anyone else seen this, or is this a specific outcome of my data/configuration? Any ideas of what may cause such a distribution? While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. Would appreciate your comments. Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: AED_hist.png
Type: image/png
Size: 8232 bytes
Desc: not available
URL:

From myandell at genetics.utah.edu Sun Apr 7 09:11:36 2019
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 7 Apr 2019 15:11:36 +0000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>

Hi Lior,

Fun! The short answer is I don't know. Obviously, the good stuff is on the right side of 0.5.

That said, I can think of a couple of things to look into to explain the left side of the graph. Are you allowing single exon genes? Are you using RNA seq data, protein, or both? What about repeat masking? Are you doing it? Do you have your own library?

My first guess would be low complexity/repeat sequences generating more or less random blastx hits across the genome... Carson, what do you think?

And finally, what does the AED look like for the genes included in the final build?

Sorry for all the questions, Lior. That's your punishment for asking an interesting one.

--mark

From: maker-devel on behalf of Lior Glick
Date: Sunday, April 7, 2019 at 7:26 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Curious pattern in AED distributions

Hi MAKER users,

Lately I've been performing annotations for multiple genomes from the same species.
When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:

[AED_hist.png]

This pattern is a bit surprising to me, in two aspects:
1) Why is there a surge towards 0.5?
2) Why is there a sudden drop right after that surge?

Has anyone else seen this, or is this a specific outcome of my data/configuration?
Any ideas of what may cause such a distribution?

While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
Would appreciate your comments.
Thank you!
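A histogram like the one described above can be rebuilt from a MAKER GFF3 file. The following is a minimal sketch, not part of MAKER: it assumes each mRNA feature carries an `_AED=` attribute in column 9 (as MAKER-produced GFF3 normally does), and the helper names `aed_values` and `aed_histogram` are hypothetical, introduced only for this illustration.

```python
import re
from collections import Counter

# Match "_AED=<number>" only when it starts an attribute, so "_eAED=" is skipped.
AED_RE = re.compile(r"(?:^|;)_AED=([0-9.]+)")


def aed_values(gff_lines):
    """Yield the _AED value of every mRNA feature in an iterable of GFF3 lines."""
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # the sequence section follows; no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        match = AED_RE.search(cols[8])
        if match:
            yield float(match.group(1))


def aed_histogram(values, n_bins=20):
    """Count values in n_bins equal-width bins over [0, 1]; 1.0 goes in the last bin."""
    counts = Counter()
    for value in values:
        # Small epsilon guards against float rounding at bin edges.
        index = min(int(value * n_bins + 1e-9), n_bins - 1)
        counts[index] += 1
    return [counts.get(i, 0) for i in range(n_bins)]
```

With 20 bins, each bin is 0.05 wide, so a pile-up just below 0.5 followed by a drop would show as a large count in bin 9 and a small one in bin 10. Reading a real file would look like `aed_histogram(aed_values(open(path)))`, where `path` is whatever merged GFF3 file is being inspected.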
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 8233 bytes
Desc: image001.png
URL:

From myandell at genetics.utah.edu Sun Apr 7 11:39:16 2019
From: myandell at genetics.utah.edu (Mark Yandell)
Date: Sun, 7 Apr 2019 17:39:16 +0000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu>
Message-ID: <116090CF-13B6-4E54-A5AA-8F7D7FCF2F23@umail.utah.edu>

Sorry. I'm dyslexic, especially early in the morning. Yes, good stuff is on the left.

As regards single exon genes, that's always a hard call, as these have a higher false positive rate. One thing to consider is how prevalent introns are in your organism. Carson can give more advice on this point, I'm sure.

By "final build", I meant: is this using the "Standard build" or "Max Build" protocol from PMC4286374?

From: Lior Glick
Date: Sunday, April 7, 2019 at 10:29 AM
To: Mark Yandell
Cc: "liorglic at mail.tau.ac.il" , "maker-devel at yandell-lab.org"
Subject: Re: [maker-devel] Curious pattern in AED distributions

Dear Mark,

Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (;

Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation...

Now for the questions:
1) I did not make any filtrations so far, so single exon genes are included as well. in fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes?
2) My evidence consist of assembled transcripts, proteins and predicted gene models (pred_gff).
3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest.

Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"?

Hope these answers are helpful, and waiting to hear more thoughts.
Thanks again.

On Sun, Apr 7, 2019, 18:11 Mark Yandell wrote:

Hi Lior,

Fun! The short answer is I don't know. Obviously, the good stuff is on the right side of 0.5.

That said, I can think of a couple of things to look into to explain the left side of the graph. Are you allowing single exon genes? Are you using RNA seq data, protein, or both? What about repeat masking? Are you doing it? Do you have your own library?

My first guess, would be low complexity/repeat sequences generating more or less random blastx hits across the genome... Carson, what do you think?

And finally, what does the AED look like for the genes included in the final build?

Sorry for all the questions, Lior. That's your punishment for asking an interesting one.

--mark

From: maker-devel on behalf of Lior Glick
Date: Sunday, April 7, 2019 at 7:26 AM
To: "maker-devel at yandell-lab.org"
Subject: [maker-devel] Curious pattern in AED distributions

Hi MAKER users,

Lately I've been performing annotations for multiple genomes from the same species.
When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:

[AED_hist.png]

This pattern is a bit surprising to me, in two aspects:
1) Why is there a surge towards 0.5?
2) Why is there a sudden drop right after that surge?

Has anyone else seen this, or is this a specific outcome of my data/configuration?
Any ideas of what may cause such a distribution?

While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
Would appreciate your comments.
Thank you!
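Before deciding how to treat single exon genes, it can help to know how many there are. The sketch below is one hedged way to count exons per transcript from a GFF3 file; it assumes exon features list their transcript in the `Parent` attribute (standard GFF3), and the helper names `exon_counts` and `single_exon_transcripts` are hypothetical. It only reports candidates and does not imply they should be discarded.

```python
from collections import defaultdict


def exon_counts(gff_lines):
    """Return {transcript_id: number_of_exons} from an iterable of GFF3 lines."""
    counts = defaultdict(int)
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # sequence section; no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        # Column 9 is a ';'-separated list of key=value pairs.
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # An exon shared by several transcripts lists them comma-separated.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return dict(counts)


def single_exon_transcripts(gff_lines):
    """Return the sorted IDs of transcripts that have exactly one exon."""
    return sorted(t for t, n in exon_counts(gff_lines).items() if n == 1)
```

Cross-referencing the resulting IDs with their AED values would show whether the single exon models are concentrated near the 0.5 surge.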
-------------- next part -------------- An HTML attachment was scrubbed... URL: From ychliu at genetics.ac.cn Tue Apr 2 19:21:33 2019 From: ychliu at genetics.ac.cn (ychliu at genetics.ac.cn) Date: Wed, 3 Apr 2019 09:21:33 +0800 Subject: [maker-devel] MAKER problem with gff3 file Message-ID: <2019040309213197334742@genetics.ac.cn> Dear MAKER developers, I recently use the MAKER to do gene annotation. But even I use the gff3 file as the EST evidence, the result shows no gene that marked by est2genome (I do use the parameter est2genome=1). It may means that the gff3 seems doesn't work. So what's the problem? How can I solve it? Eager for you assistance. Faithfully yours. Yucheng Liu Yucheng Liu Institute of Genetics and Developmental Biology, CAS Beijing, 100101 China Tel: 86-010-64801362 E-mail: ychliu at genetics.ac.cn -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglck at gmail.com Sun Apr 7 10:29:13 2019 From: liorglck at gmail.com (Lior Glick) Date: Sun, 7 Apr 2019 19:29:13 +0300 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu> References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu> Message-ID: Dear Mark, Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (; Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation... Now for the questions: 1) I did not make any filtrations so far, so single exon genes are included as well. in fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes? 
2) My evidence consist of assembled transcripts, proteins and predicted gene models (pred_gff). 3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest. Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"? Hope these answers are helpful, and waiting to hear more thoughts. Thanks again. On Sun, Apr 7, 2019, 18:11 Mark Yandell wrote: > Hi Lior, > > > > > > Fun! The short answer is I don?t know. Obviously, the good stuff is on the > right side of 0.5. > > That said, I can think of a couple of things to look into to explain the > left side of the graph. Are you allowing single exon genes? Are you using > RNA seq data, protein, or both? What about repeat masking? Are you doing > it? Do you have your own library? > > > > My first guess, would be low complexity/repeat sequences generating more > or less random blastx hits across the genome?Carson, what do you think? > > > > And finally, what does the AED look like for the genes included in the > final build? > > > > > > Sorry for all the questions, Lior. That?s your punishment for asking an > interesting one. ? > > > > --mark > > > > > > *From: *maker-devel on behalf of > Lior Glick > *Date: *Sunday, April 7, 2019 at 7:26 AM > *To: *"maker-devel at yandell-lab.org" > *Subject: *[maker-devel] Curious pattern in AED distributions > > > > Hi MAKER users, > > Lately I've been performing annotations for multiple genomes from the same > species. > > When plotting the histogram of AED scores over all genes, I repeatedly see > a very specific pattern, that looks something like this: > > [image: AED_hist.png] > > This pattern is a bit surprising to me, in two aspects: > > 1) Why is there a surge towards 0.5? > > 2) Why is there a sudden drop right after that surge? > > > > Has anyone else seen this, or is this a specific outcome of my > data/configuration? 
>
> Any ideas of what may cause such a distribution?
>
>
>
> While this is not necessarily an indication of a problem or bug, it does
> seem a bit odd, and might imply some bias or artifact.
>
> Would appreciate your comments.
>
> Thank you!
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 8233 bytes
Desc: not available
URL:

From carsonhh at gmail.com Sun Apr 7 19:06:49 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:06:49 -0600
Subject: [maker-devel] MAKER problem with gff3 file
In-Reply-To: <2019040309213197334742@genetics.ac.cn>
References: <2019040309213197334742@genetics.ac.cn>
Message-ID: <961D15D1-36C0-4DD9-BE81-7C652A2C4CCF@gmail.com>

The est2genome=1 option in MAKER2 only works with input FASTA files because it's based on Exonerate's est2genome alignments. It does not work with GFF3 input (GFF3 is missing some things that are in the Exonerate report). MAKER3, however, will let you do this with GFF3 input (it goes back and tries to predict the missing info that Exonerate would have produced).

--Carson

> On Apr 2, 2019, at 7:21 PM, ychliu at genetics.ac.cn wrote:
>
> Dear MAKER developers,
> I recently use the MAKER to do gene annotation. But even I use the gff3 file as the EST evidence, the result shows no gene that marked by est2genome (I do use the parameter est2genome=1). It may means that the gff3 seems doesn't work. So what's the problem? How can I solve it? Eager for you assistance.
> Faithfully yours.
> Yucheng Liu
>
> Yucheng Liu
> Institute of Genetics and Developmental Biology, CAS
> Beijing, 100101 China
> Tel: 86-010-64801362
> E-mail: ychliu at genetics.ac.cn
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Sun Apr 7 19:08:54 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:08:54 -0600
Subject: [maker-devel] Installation Failure with pg_config?
In-Reply-To:
References:
Message-ID: <89174279-57D0-46D3-BE9D-FA03ED861227@gmail.com>

DBD::Pg is optional. You should be able to say "No" to the question on whether you want to install optional modules during the Build step.

--Carson

> On Mar 14, 2019, at 4:24 PM, Shaowen Jiang wrote:
>
> Dear MAKER2 admins:
>
> Hi, I have read some tutorials for annotating a newly assembly genome and MAKER2 seems to be a very good and functional pipeline to me. So I am trying to use it to annotate a new assembly mammalian genome that our lab just generated.
> But I was stuck while I was trying to install MAKER2 to our slurm HPC server.
> I think the pipeline is trying to install several perl packages locally, but one of them called DBD::Pg requires the path of pg_config?
> screenshot as below
>
> But I think our server doesn't have this path and I don't have root to install some other stuff, like libpq-dev or PostgreSQL.
> Is that any other methods that can circle around that?
>
> Any help or advice would be appreciated!
>
> Thanks,
> Shaowen
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Sun Apr 7 19:32:39 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 19:32:39 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID:

That's interesting. It could be a handful of internal filters that help with spurious results.

I use a 0.5 sensitivity/specificity to identify shared edges for a Jaccardian split on overlapping evidence clusters, for example. There are also a couple of places where, if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or EST splice support, but just a chunk of single exon blastx), then MAKER will use a reading-frame-aware AED value of 0.5 as a filter (as in, it checks whether the reading frame matches and not just raw overlap). If that's the case, the spike near 0.5 may indicate I needed to be a little stricter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff for these spurious blastx-induced models.

--Carson

> On Apr 7, 2019, at 7:25 AM, Lior Glick wrote:
>
> Hi MAKER users,
> Lately I've been performing annotations for multiple genomes from the same species.
> When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:
>
> This pattern is a bit surprising to me, in two aspects:
> 1) Why is there a surge towards 0.5?
> 2) Why is there a sudden drop right after that surge?
>
> Has anyone else seen this, or is this a specific outcome of my data/configuration?
> Any ideas of what may cause such a distribution?
>
> While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
> Would appreciate your comments.
> Thank you!
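The 0.5 thresholds discussed above are cutoffs on AED and on its sensitivity/specificity components. As a rough illustration only, the sketch below computes a nucleotide-level AED under the commonly published definition, AED = 1 - (SN + SP) / 2, where SN is the fraction of evidence bases overlapped by the annotation and SP is the fraction of annotation bases overlapped by evidence. This is a simplification under stated assumptions: it ignores strand and reading frame (so it is not the reading-frame-aware variant mentioned above), it is not MAKER's own code, and the function names `covered_positions` and `aed` are hypothetical.

```python
def covered_positions(intervals):
    """Return the set of 1-based positions covered by inclusive (start, end) intervals."""
    positions = set()
    for start, end in intervals:
        positions.update(range(start, end + 1))
    return positions


def aed(annotation_exons, evidence_intervals):
    """Nucleotide-level AED = 1 - (SN + SP) / 2 for one annotation versus its evidence."""
    annotation = covered_positions(annotation_exons)
    evidence = covered_positions(evidence_intervals)
    if not annotation or not evidence:
        # Convention assumed here: no overlap is possible, so the distance is maximal.
        return 1.0
    shared = len(annotation & evidence)
    sensitivity = shared / len(evidence)    # evidence bases recovered by the annotation
    specificity = shared / len(annotation)  # annotation bases supported by evidence
    return 1.0 - (sensitivity + specificity) / 2.0
```

For example, a 100 bp single exon model at (1, 100) with one alignment at (51, 150) shares 50 bases, giving SN = SP = 0.5 and therefore an AED of 0.5, which is exactly the kind of borderline model a 0.5 cutoff would sit on.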
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org

From xvazquezc at gmail.com Sun Apr 7 22:42:15 2019
From: xvazquezc at gmail.com (Xabier Vázquez-Campos)
Date: Mon, 8 Apr 2019 14:42:15 +1000
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID:

If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. For example, by default it will require having some EST evidence.

On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote:

> That's interesting. It could be a handful of internal filters that help
> with spurious results.
>
> I use a 0.5 sensitivity/specificity to identify shared edges for a
> jaccardian split on overlapping evidence clusters for example. There are
> also a couple of places where if the only thing supporting a model is a
> single exon blastx hit (i.e. no exonerate, ab initio model, or est splice
> support, but just a chunk of single exon blastx) then maker will use a
> reading frame aware AED value of 0.5 as a filter (as in it checks if the
> reading frame matches and not just raw overlap). If that's the case, the
> spike near 0.5 may indicate I needed to be a little stricter than my
> empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff
> for these spurious blastx induced models.
>
> --Carson
>
>
> > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote:
> >
> > Hi MAKER users,
> > Lately I've been performing annotations for multiple genomes from the
> same species.
> > When plotting the histogram of AED scores over all genes, I repeatedly
> see a very specific pattern, that looks something like this:
> >
> > This pattern is a bit surprising to me, in two aspects:
> > 1) Why is there a surge towards 0.5?
> > 2) Why is there a sudden drop right after that surge?
> >
> > Has anyone else seen this, or is this a specific outcome of my
> data/configuration?
> > Any ideas of what may cause such a distribution?
> >
> > While this is not necessarily an indication of a problem or bug, it does
> seem a bit odd, and might imply some bias or artifact.
> > Would appreciate your comments.
> > Thank you!
> > _______________________________________________
> > maker-devel mailing list
> > maker-devel at box290.bluehost.com
> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
>
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>

--
Xabier Vázquez-Campos, *PhD*
*Research Associate*
NSW Systems Biology Initiative
School of Biotechnology and Biomolecular Sciences
The University of New South Wales
Sydney NSW 2052 AUSTRALIA
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Sun Apr 7 23:20:24 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Sun, 7 Apr 2019 23:20:24 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References:
Message-ID: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>

Yes. maker2zff tries to further select a subset of the best supported models by requiring multiple forms of evidence support.

--Carson

> On Apr 7, 2019, at 10:42 PM, Xabier Vázquez-Campos wrote:
>
> If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. e.g. by default it will require having some EST evidence
>
> On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote:
> That's interesting. It could be a handful of internal filters that help with spurious results.
>
> I use a 0.5 sensitivity/specificity to identify shared edges for a jaccardian split on overlapping evidence clusters for example.
There are also a couple of places where if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or est splice support, but just a chunk of single exon blastx) then maker will use a reading frame aware AED value of 0.5 as a filter (as in it checks if the reading frame matches and not just raw overlap). If that's the case, the spike near 0.5 may indicate I needed to be a little stricter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff for these spurious blastx induced models. > > --Carson > > > > On Apr 7, 2019, at 7:25 AM, Lior Glick > wrote: > > > > Hi MAKER users, > > Lately I've been performing annotations for multiple genomes from the same species. > > When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: > > > > This pattern is a bit surprising to me, in two aspects: > > 1) Why is there a surge towards 0.5? > > 2) Why is there a sudden drop right after that surge? > > > > Has anyone else seen this, or is this a specific outcome of my data/configuration? > > Any ideas of what may cause such a distribution? > > > > While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. > > Would appreciate your comments. > > Thank you!
> > _______________________________________________ > > maker-devel mailing list > > maker-devel at box290.bluehost.com > > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > -- > Xabier Vázquez-Campos, PhD > Research Associate > NSW Systems Biology Initiative > School of Biotechnology and Biomolecular Sciences > The University of New South Wales > Sydney NSW 2052 AUSTRALIA -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Mon Apr 8 00:54:06 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Mon, 8 Apr 2019 09:54:06 +0300 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: Hello again and thank you all for your interesting answers. I mistakenly answered Mark yesterday from an unsubscribed address, so only he received it. For documentation's sake, I'm posting my answer here again, along with Mark's reply: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Dear Mark, Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (; Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation... Now for the questions: 1) I have not applied any filtering so far, so single exon genes are included as well. In fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes?
2) My evidence consists of assembled transcripts, proteins and predicted gene models (pred_gff). 3) As for repeats, I'm masking based on a repeat library obtained from a previous publication, specific to my organism of interest. Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"? Hope these answers are helpful, and waiting to hear more thoughts. Thanks again. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ *To which Mark replied:* Sorry. I'm dyslexic, especially early in the morning. Yes, good stuff is on the left. As regards single exon genes, that's always a hard call, as these have a higher false positive rate. One thing to consider is how prevalent introns are in your organism. Carson can give more advice on this point, I'm sure. By "final build", I meant is this using the 'Standard build' or 'Max Build' protocol from PMC4286374? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Mark - well, as I said I haven't done any filtration yet, so I guess my annotation currently includes genes that would be discarded even with the "max build". I'll give this a try and look at the resulting distribution. Xabier - thanks, but I'm not using SNAP (just Augustus). Carson - I see a few fingers pointing in the direction of single-exon models, so maybe I should see what happens to the distribution of AED when these genes are removed. I'll get back to you with some more results. On Mon, 8 Apr 2019 at 8:20, Carson Holt wrote: > Yes. maker2zff tries to further select a subset of the best supported > models by requiring multiple forms of evidence support. > > --Carson > > > On Apr 7, 2019, at 10:42 PM, Xabier Vázquez-Campos > wrote: > > If you train SNAP, the maker2zff script has internal quality cutoffs based > on the existence of evidence. e.g.
by default it will require having some > EST evidence > > On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote: > >> That's interesting. It could be a handful of internal filters that help >> with spurious results. >> >> I use a 0.5 sensitivity/specificity to identify shared edges for a >> jaccardian split on overlapping evidence clusters for example. There are >> also a couple of places where if the only thing supporting a model is a >> single exon blastx hit (i.e. no exonerate, ab initio model, or est splice >> support, but just a chunk of single exon blastx) then maker will use a >> reading frame aware AED value of 0.5 as a filter (as in it checks if the >> reading frame matches and not just raw overlap). If that's the case, the >> spike near 0.5 may indicate I needed to be a little stricter than my >> empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff >> for these spurious blastx induced models. >> >> --Carson >> >> >> > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote: >> > >> > Hi MAKER users, >> > Lately I've been performing annotations for multiple genomes from the >> same species. >> > When plotting the histogram of AED scores over all genes, I repeatedly >> see a very specific pattern, that looks something like this: >> > >> > This pattern is a bit surprising to me, in two aspects: >> > 1) Why is there a surge towards 0.5? >> > 2) Why is there a sudden drop right after that surge? >> > >> > Has anyone else seen this, or is this a specific outcome of my >> data/configuration? >> > Any ideas of what may cause such a distribution? >> > >> > While this is not necessarily an indication of a problem or bug, it >> does seem a bit odd, and might imply some bias or artifact. >> > Would appreciate your comments. >> > Thank you!
>> > _______________________________________________ >> > maker-devel mailing list >> > maker-devel at box290.bluehost.com >> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> >> >> _______________________________________________ >> maker-devel mailing list >> maker-devel at box290.bluehost.com >> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org >> > > > -- > Xabier Vázquez-Campos, *PhD* > *Research Associate* > NSW Systems Biology Initiative > School of Biotechnology and Biomolecular Sciences > The University of New South Wales > Sydney NSW 2052 AUSTRALIA > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Mon Apr 8 03:10:15 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Mon, 8 Apr 2019 12:10:15 +0300 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: Hi again - quick update: I made a plot comparing the histograms of single-exon genes to multi-exon genes: [image: newplot (5).png] It definitely looks like single-exon genes are *enriched* for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software. Any other thoughts/suggestions? Thanks again, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: newplot (5).png Type: image/png Size: 18037 bytes Desc: not available URL: From carsonhh at gmail.com Mon Apr 8 10:48:42 2019 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 8 Apr 2019 10:48:42 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: One note.
When I say single exon blastx hit, I mean that the evidence is single exon, not that the gene model is single exon. What I think you are seeing is an effect that seems to be partially related to under-masking, i.e. a spurious partial blastx alignment to a low complexity repeat (which is why the blastx protein alignment refuses to polish with exonerate). That is why the filter was added. So if a model (single or multi-exon) has no additional ab initio prediction support, has no EST support, and has no exonerate polished protein support, but does have a single-exon/single-hsp blastx overlap, it gets filtered out at 0.5 (that threshold is based on trial and error on a couple of genomes where we saw this occur - but your graph suggests that filter might be too loose and 0.4 or 0.45 might be a better value). So the spike is caused by poor blastx and under-masking (this may be explained if you are using pred_gff models that were generated on an unmasked assembly outside of MAKER), and the drop around 0.5 is caused by MAKER filtering out models only supported by what appear to be spurious blastx alignments. --Carson > On Apr 8, 2019, at 3:10 AM, Lior Glick wrote: > > Hi again - quick update: > I made a plot comparing the histograms of single-exon genes to multi-exon genes: > > It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software. > Any other thoughts/suggestions? > > Thanks again, > -------------- next part -------------- An HTML attachment was scrubbed...
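[Editorial sketch, not part of the original thread.] One way to look at the AED pattern discussed above is to bin the `_AED` values of MAKER's mRNA features separately for 1-exon, 2-exon and 3+-exon models. The Python below is a minimal sketch under stated assumptions: the input is MAKER's tab-delimited GFF3 (the archive shows it flattened to spaces), each mRNA line carries `ID` and `_AED` attributes as in the examples posted to this list, and the function name `aed_by_exon_class` and the 0.05 bin width are hypothetical choices made for illustration. It does not reproduce MAKER's internal filter.

```python
from collections import Counter, defaultdict


def parse_attributes(field):
    """Split a GFF3 attribute column (key=value pairs joined by ';') into a dict."""
    attrs = {}
    for pair in field.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attrs[key] = value
    return attrs


def aed_by_exon_class(gff_lines, bin_width=0.05):
    """Count mRNAs per AED bin, split into 1-exon, 2-exon and 3+-exon models.

    Returns {class label: Counter({lower edge of AED bin: number of mRNAs})}.
    """
    aed = {}                # mRNA ID -> AED value
    exon_count = Counter()  # mRNA ID -> number of exon features listing it as Parent
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # skip FASTA or malformed lines
        attrs = parse_attributes(cols[8])
        if cols[2] == "mRNA" and "ID" in attrs and "_AED" in attrs:
            aed[attrs["ID"]] = float(attrs["_AED"])
        elif cols[2] == "exon":
            # GFF3 allows one exon line to list several comma-separated parents
            for parent in attrs.get("Parent", "").split(","):
                if parent:
                    exon_count[parent] += 1

    hist = defaultdict(Counter)
    for mrna_id, value in aed.items():
        n = exon_count[mrna_id]
        if n == 0:
            continue  # no exon features seen for this mRNA
        label = "1 exon" if n == 1 else "2 exons" if n == 2 else "3+ exons"
        # small epsilon guards against values sitting exactly on a bin edge
        bin_start = round(int(value / bin_width + 1e-9) * bin_width, 2)
        hist[label][bin_start] += 1
    return dict(hist)
```

Calling `aed_by_exon_class(open("genome.all.gff"))` (file name hypothetical) and plotting the three counters side by side would show whether the pile-up just below 0.5 is concentrated in one exon-count class.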
URL: From carsonhh at gmail.com Mon Apr 8 10:51:55 2019 From: carsonhh at gmail.com (Carson Holt) Date: Mon, 8 Apr 2019 10:51:55 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Message-ID: <75B7E2C9-2D1B-452F-BEED-704289C881ED@gmail.com> Try also adding 2 exon models to the graph. It would be interesting to see if these are attempted single-exon models where the predictor added a micro-intron to keep the open reading frame going against a single exon blastx hint. --Carson > On Apr 8, 2019, at 3:10 AM, Lior Glick wrote: > > Hi again - quick update: > I made a plot comparing the histograms of single-exon genes to multi-exon genes: > > It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software. > Any other thoughts/suggestions? > > Thanks again, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ying.hu at ufl.edu Wed Apr 17 09:20:00 2019 From: ying.hu at ufl.edu (Hu,Ying) Date: Wed, 17 Apr 2019 15:20:00 +0000 Subject: [maker-devel] maker exons number Message-ID: Hi, Carson, I am using MAKER 2.31.6 to annotate a genome. I noticed that the exon numbers in each gene do not start from 1. Can you give me some suggestions on how to change the exon numbers to 1, 2, 3, ... in each gene? Thanks, Ying Here are some examples: tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0 tig00000226|arrow maker mRNA 26339 27915 . + .
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82 tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27490 27636 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27738 27808 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 tig00000226|arrow maker mRNA 5803 6589 . + . 
ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 5803 5975 . + 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 6505 6589 . + 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker five_prime_UTR 40927 41192 . + . 
ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker gene 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 tig00034405|arrow maker mRNA 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 59245 59725 . 
+ 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 tig00034405|arrow maker exon 90355 90911 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91011 91086 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93711 93786 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94536 94573 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93866 93972 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91179 91240 . 
- 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From huyingwin at gmail.com Wed Apr 17 09:23:10 2019 From: huyingwin at gmail.com (YING HU) Date: Wed, 17 Apr 2019 11:23:10 -0400 Subject: [maker-devel] maker exon number Message-ID: Hi, Carson, I am using MAKER 2.31.6 to annotate a genome. I noticed that the exon numbers in each gene do not start from 1. Can you give me some suggestions on how to change the exon numbers to 1, 2, 3, ... in each gene? Thanks, Ying Here are some examples: tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0 tig00000226|arrow maker mRNA 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82 tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27490 27636 . + .
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27738 27808 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 tig00000226|arrow maker mRNA 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 5803 5975 . 
+ 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 6505 6589 . + 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker five_prime_UTR 40927 41192 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker gene 57931 61565 . + . 
ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 tig00034405|arrow maker mRNA 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 59245 59725 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 tig00034405|arrow maker exon 90355 90911 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91011 91086 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93711 93786 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94536 94573 . 
- 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93866 93972 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91179 91240 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 -------------- next part -------------- An HTML attachment was scrubbed... 
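[Editorial sketch, not part of the original thread.] For the exon-numbering question above: in the posted records the numeric suffix of each exon `ID` (58, 59, ..., 7163, ...) looks like a running counter across the file and not a per-transcript position. If a 1, 2, 3, ... ordinal per transcript is wanted, it can be derived from coordinates and strand without touching the IDs. The Python below is a minimal sketch under stated assumptions: tab-delimited GFF3 input, exons linked to transcripts only through `Parent`, and a hypothetical function name `exon_ordinals`. Because GFF3 permits one exon line to list several parents, the ordinal is computed per parent transcript.

```python
from collections import defaultdict


def exon_ordinals(gff_lines):
    """Return {mRNA ID: [(ordinal, start, end, exon ID), ...]} in 5'->3' order."""
    exons = defaultdict(list)  # parent mRNA ID -> [(start, end, strand, exon ID)]
    for line in gff_lines:
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(p.split("=", 1) for p in cols[8].split(";") if "=" in p)
        # one exon line may belong to several transcripts (comma-separated Parent)
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append(
                    (int(cols[3]), int(cols[4]), cols[6], attrs.get("ID", ""))
                )

    numbered = {}
    for parent, items in exons.items():
        # on the minus strand the first exon of the transcript has the highest start
        minus = items[0][2] == "-"
        ordered = sorted(items, key=lambda e: e[0], reverse=minus)
        numbered[parent] = [
            (i, start, end, exon_id)
            for i, (start, end, _strand, exon_id) in enumerate(ordered, 1)
        ]
    return numbered
```

The result can be written to a separate table or added as a custom attribute; the existing `ID` values are best left unchanged so that `Parent` references stay valid.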
URL:

From carsonhh at gmail.com Wed Apr 17 13:43:41 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 17 Apr 2019 13:43:41 -0600
Subject: [maker-devel] maker exon number
In-Reply-To:
References:
Message-ID:

The ID= value is simply a unique value used to resolve inheritance in conjunction with Parent=. It has no biological meaning. Also, to reduce redundancy, the GFF3 format allows a single "exon" feature to be the child of multiple mRNA features, so a single "exon" line can be the first exon in one transcript but the second exon in another.

--Carson

-------------- next part --------------
An HTML attachment was scrubbed...
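Because the exon ID= suffix carries no ordering information, per-transcript exon numbers have to be derived from coordinates and strand instead. A minimal Python sketch of that idea follows. It is not a MAKER tool: the function name number_exons and the shortened IDs in the demo are hypothetical, and it assumes tab-separated GFF3 in which every exon line has a Parent= attribute (possibly listing several transcripts).

```python
from collections import defaultdict


def number_exons(gff_lines):
    """Map each transcript ID to its exons as (number, start, end), numbered 5'->3'."""
    spans = defaultdict(set)   # transcript ID -> {(start, end), ...}
    strand = {}                # transcript ID -> "+" or "-"
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        # Skip comments, blank lines, and anything that is not an exon feature.
        if line.startswith("#") or len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # One exon line may belong to several transcripts: Parent=mRNA-1,mRNA-2
        for parent in filter(None, attrs.get("Parent", "").split(",")):
            spans[parent].add((int(cols[3]), int(cols[4])))
            strand[parent] = cols[6]
    numbered = {}
    for parent, exons in spans.items():
        # Ascending coordinates on "+", descending on "-".
        ordered = sorted(exons, reverse=(strand[parent] == "-"))
        numbered[parent] = [(i, s, e) for i, (s, e) in enumerate(ordered, 1)]
    return numbered


# Demo with coordinates from the thread; the IDs are shortened, hypothetical stand-ins.
demo = [
    "tig00000226|arrow\tmaker\texon\t5803\t5975\t.\t+\t.\tID=e62;Parent=plus-mRNA-1",
    "tig00000226|arrow\tmaker\texon\t6505\t6589\t.\t+\t.\tID=e63;Parent=plus-mRNA-1",
    "tig00034405|arrow\tmaker\texon\t94536\t94573\t.\t-\t.\tID=e7164;Parent=minus-mRNA-1",
    "tig00034405|arrow\tmaker\texon\t94645\t94796\t.\t-\t.\tID=e7163;Parent=minus-mRNA-1",
]
for transcript, exons in number_exons(demo).items():
    print(transcript, exons)
```

The numbering is kept in a separate table rather than written back into ID=, since a shared exon line can hold a different position in each of its parent transcripts.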
URL:

From paul at tupac.bio Thu Apr 18 03:23:35 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Thu, 18 Apr 2019 18:23:35 +0900
Subject: [maker-devel] maker_functional_gff Error
Message-ID:

Dear MAKER Team,

I am running MAKER 2.31.10 on a 32-core instance. I followed the Post Processing of Annotations steps as described in the MAKER Tutorial for GMOD Online Training 2014 as best I could, but I get an error when I run maker_functional_gff. The commands in the order of execution and relevant output are shown below.

Where did I go wrong?

# run blastp command
blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 -out output.blastp

# run interproscan command
interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p -i genome.all.maker.proteins.fasta -o output.iprscan

# create naming table
maker_map_ids --prefix ThuMac --justify 5 genome.all.gff > genome.all.map

# copy files for safe keeping
cp genome.all.gff genome.all.renamed.gff
cp genome.all.noseq.gff genome.all.noseq.renamed.gff
cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta
cp genome.all.maker.proteins.aed.0.50.fasta genome.all.maker.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.unique.proteins.aed.0.50.fasta genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.transcripts.fasta genome.all.maker.transcripts.renamed.fasta
cp genome.all.maker.transcripts.aed.0.50.fasta genome.all.maker.transcripts.aed.0.50.renamed.fasta
cp output.iprscan output.renamed.iprscan
cp output.blastp output.renamed.blastp

# replace uninformative MAKER protein/transcript names with useful ones
map_gff_ids genome.all.map genome.all.renamed.gff
map_gff_ids genome.all.map genome.all.noseq.renamed.gff
map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.proteins.aed.0.50.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.transcripts.aed.0.50.renamed.fasta
map_data_ids genome.all.map output.renamed.iprscan
map_data_ids genome.all.map output.renamed.blastp

# assign annotations
maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff

> head output.renamed.blastp
ThuMac30929-RA P20036 41.791 134 77 1 326 458 113 246 9.51e-28 114
ThuMac19623-RA P81018 35.714 168 87 2 1 147 1 168 8.40e-33 117
ThuMac19629-RA Q66I51 68.939 264 79 2 1 263 1 262 1.48e-130 372
ThuMac19628-RA Q61464 55.172 87 37 1 766 852 382 466 4.42e-25 119
ThuMac19627-RA P07898 48.276 58 29 1 13 69 1962 2019 3.60e-13 65.9
ThuMac19626-RA P81018 36.782 174 96 2 21 180 1 174 5.75e-36 127
ThuMac19624-RA P81018 35.057 174 99 2 21 180 1 174 2.19e-33 120
ThuMac19625-RA Q28343 32.520 123 43 2 35 117 2123 2245 7.57e-17 78.6
ThuMac19636-RA Q9QX29 90.909 110 10 0 5 114 458 567 6.45e-65 216
ThuMac19638-RA Q9QX29 57.391 115 35 3 5 114 703 808 3.06e-28 120

> head output.renamed.iprscan
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00520 Ion transport protein 154 413 3.8E-21 T 18-04-2019 IPR005821 Ion transport domain GO:0005216|GO:0006811|GO:0016020|GO:0055085
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF08412 Ion transport protein N-terminal 109 152 5.1E-19 T 18-04-2019 IPR013621 Ion transport N-terminal Reactome: R-HSA-1296061
ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00027 Cyclic nucleotide-binding domain 519 601 1.0E-17 T 18-04-2019 IPR000595 Cyclic nucleotide-binding domain
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF13765 SPRY-associated domain 235 283 8.9E-23 T 18-04-2019 IPR006574 SPRY-associated
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00643 B-box zinc finger 18 56 5.2E-12 T 18-04-2019 IPR000315 B-box-type zinc finger GO:0008270
ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00622 SPRY domain 287 391 2.2E-14 T 18-04-2019 IPR003877 SPRY domain GO:0005515
ThuMac08369-RA 7aee1da5a47975ab8e43b68bfd1a117c 139 Pfam PF00076 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain) 22 87 1.6E-15 T 18-04-2019 IPR000504 RNA recognition motif domain GO:0003676
ThuMac26054-RA 8f4119609312bd6442f8bb094c104231 462 Pfam PF07565 Band 3 cytoplasmic domain 173 443 7.3E-100 T 18-04-2019 IPR013769 Band 3 cytoplasmic domain GO:0006820|GO:0008509|GO:0016021 Reactome: R-HSA-425381
ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF03372 Endonuclease/Exonuclease/phosphatase family 235 535 7.0E-11 T 18-04-2019 IPR005135 Endonuclease/exonuclease/phosphatase
ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF17751 SKICH domain 555 649 9.8E-23 T 18-04-2019 IPR041611 SKICH domain

> map_data_ids genome.all.map output.renamed.iprscan
WARNING: No mapping available for ThuMac01937-RA
WARNING: No mapping available for ThuMac02226-RA
WARNING: No mapping available for ThuMac20730-RA
WARNING: No mapping available for ThuMac20730-RA
WARNING: No mapping available for ThuMac14750-RA
(Thousands of warnings like these were returned)

> maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff
Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.

> head genome.all.renamed.putative_function.gff
##gff-version 3
scf7180000008677_pilon_pilon . contig 1 49996 . . . ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon

Thanks in Advance,

Paul Sheridan

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From carsonhh at gmail.com Mon Apr 22 11:50:27 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Apr 2019 11:50:27 -0600
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To:
References:
Message-ID: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>

The warning "WARNING: No mapping available for ThuMac01937-RA" means you are running on a file that has already been renamed. An unrenamed file has names like maker-SDFGDG-gene-0.1-mRNA-1, for example; instead the script is finding the name ThuMac01937-RA, which is not in the first column of the map file, so it throws a warning.

The second one:

> Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.

You likely have a truncated line in the GFF3. It's missing an ID= tag. This can sometimes happen when writing to network-mounted (NFS) file systems because of an asynchronous IO error. NFS file systems have a performance enhancement where they return SUCCESS on IO operations immediately and then complete the IO operation later in the background. This improves speed by letting the program advance without blocking on the IO operation, but it reduces reliability, because if the later operation is not really successful, it can't go back and tell the program "never mind, it failed." The result is a silent truncation of data. This is not super common, but not all that rare either, depending on IO load (e.g. heavy MPI with lots of writes). Find the line that's truncated, then rerun just that contig before building the merged GFF3 for everything.

--Carson

-------------- next part --------------
An HTML attachment was scrubbed...
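One way to find the truncated line Carson describes is to scan the GFF3 for feature lines that do not have nine tab-separated columns or that lack an ID= attribute. The Python sketch below does that. It is an illustration, not part of MAKER: the function name find_suspect_lines and the sample records are hypothetical, and it assumes that every feature line in the merged file is expected to carry an ID= tag.

```python
def find_suspect_lines(gff_lines):
    """Return (line_number, reason) pairs for feature lines that look truncated."""
    suspects = []
    for number, line in enumerate(gff_lines, 1):
        text = line.rstrip("\r\n")
        if text.startswith("##FASTA"):
            break  # embedded sequence follows; there are no more feature lines
        if not text or text.startswith("#"):
            continue
        cols = text.split("\t")
        if len(cols) != 9:
            suspects.append((number, f"expected 9 columns, found {len(cols)}"))
            continue
        # Collect the attribute keys from column 9 (key=value pairs split on ";").
        keys = {kv.split("=", 1)[0] for kv in cols[8].split(";")}
        if "ID" not in keys:
            suspects.append((number, "no ID= attribute in column 9"))
    return suspects


# Hypothetical sample: line 3 lost its ID= tag, line 4 was cut off mid-record.
sample = [
    "##gff-version 3\n",
    "ctg1\t.\tcontig\t1\t49996\t.\t.\t.\tID=ctg1;Name=ctg1\n",
    "ctg1\tmaker\tgene\t10\t200\t.\t+\t.\tName=gene1\n",
    "ctg1\tmaker\tmRNA\t10\n",
]
for number, reason in find_suspect_lines(sample):
    print(f"line {number}: {reason}")
```

On a real file the function can be given an open file handle, since it only iterates over lines; the first column of each reported line names the contig to rerun.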
URL:

From paul at tupac.bio Sun Apr 28 19:40:12 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Mon, 29 Apr 2019 10:40:12 +0900
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
References: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
Message-ID:

Hi Carson,

Thanks, your suggestions got me sorted out.

Best,
Paul

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bastian.schiffthaler at umu.se Wed Apr 24 01:43:15 2019
From: bastian.schiffthaler at umu.se (Bastian Schiffthaler)
Date: Wed, 24 Apr 2019 07:43:15 -0000
Subject: [maker-devel] Redundant FASTA headers
Message-ID: <251d38f5-c15a-6070-fcd9-d6144744885e@umu.se>

Hi,

I'm running the MPI version of MAKER and I'm supplying seven different Trinity assemblies (different experiments) as evidence. Trinity will not generate unique FASTA headers >across< files, so I'm wondering if there could be an issue with ID collision. What does MAKER use the headers for? Could it create race conditions in temp files?

Thanks in advance,
Bastian

From Christian_jpg2 at hotmail.com Tue Apr 30 12:42:31 2019
From: Christian_jpg2 at hotmail.com (Christian Ayala)
Date: Tue, 30 Apr 2019 18:42:31 +0000
Subject: [maker-devel] Running out of time in MAKER
Message-ID:

Good afternoon,

I am trying to annotate some insect genomes using MAKER. MAKER is running on a system that uses a PBS scheduler and has a walltime limit of 120 hours. As a result, my jobs are running out of time and are killed before MAKER finishes the annotation. Is there a way to resume a killed MAKER run?

Thanks for your help.

Best regards,
Christian Ayala-Ortiz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From paul at tupac.bio Thu Apr 4 16:46:12 2019 From: paul at tupac.bio (Paul Sheridan) Date: Fri, 5 Apr 2019 07:46:12 +0900 Subject: [maker-devel] Running SNAP with MAKER Message-ID: Dear MAKER Team, I am running MAKER 2.31.10 a 32 core instance. My first pass completed successfully. However, my second pass using SNAP and Augustus trained ab initio gene predictions failed. Here is some example output which illustrates the problem: MAKER WARNING: Changes in control files make re-use of all old data impossible All old files will be erased before continuing processing all repeats doing repeat masking doing repeat masking #--------------------------------------------------------------------- Now starting the contig!! SeqID: scf7180000008677_pilon_pilon Length: 49996 #--------------------------------------------------------------------- doing repeat masking preparing ab-inits running snap. #--------- command -------------# Widget::snap: /usr/bin/snap /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm /tmp/maker_8RuX8Z/scf718 0000006915_pilon_pilon.abinit_masked.0 > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske d.0.genome%2Ehmm.snap #-------------------------------# setting up GFF3 output and fasta chunks processing all repeats doing repeat masking in cluster::shadow_cluster... ...finished clustering. error: unknown command "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap help' ERROR: Snap failed --> rank=21, hostname=localhost ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2 FAILED CONTIG:scf7180000007575_pilon_pilon I confirmed that the path to genome.hmm is correct. 
In addition, run.log contains the following kind of output: STARTED genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m asked.0.genome%2Ehmm.snap DIED RANK 30:4:0:0 DIED COUNT 2 DIED RANK 30 DIED COUNT 2 How can I resolve this issue? Also, is the warning about it being impossible to use the old data to be expected? Attached files: - maker_otps1.ctl: first pass control file - maker_opts2.ctl: second pass control file - run.log: log file for an example contig Thanks in Advance, Paul Sheridan -- CSO at Tupac Bio Email: paul at tupac.bio Homepage: www.paulsheridan.net Mobile: +81 80 7889 0859 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts2.ctl Type: application/octet-stream Size: 4728 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: run.log Type: application/octet-stream Size: 2366 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: maker_opts1.ctl Type: application/octet-stream Size: 4515 bytes Desc: not available URL: From carsonhh at gmail.com Sat Apr 6 15:00:14 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sat, 6 Apr 2019 15:00:14 -0600 Subject: [maker-devel] Running SNAP with MAKER In-Reply-To: References: Message-ID: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com> The error is being thrown by snap itself. Perhaps there is an issue with the genome.hmm file. Did you generate the file immediately previously to this run? Perhaps you can redo that process, and review any errors that come up during training. 
Some details on training SNAP from the wiki --> http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors --Carson > On Apr 4, 2019, at 4:46 PM, Paul Sheridan wrote: > > Dear MAKER Team, > > I am running MAKER 2.31.10 a 32 core instance. My first pass completed successfully. However, my second pass using SNAP and Augustus trained ab initio gene predictions failed. Here is some example output which illustrates the problem: > > MAKER WARNING: Changes in control files make re-use of all old data impossible > All old files will be erased before continuing > processing all repeats > doing repeat masking > doing repeat masking > #--------------------------------------------------------------------- > Now starting the contig!! > SeqID: scf7180000008677_pilon_pilon > Length: 49996 > #--------------------------------------------------------------------- > > doing repeat masking > preparing ab-inits > running snap. > #--------- command -------------# > Widget::snap: > /usr/bin/snap /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm /tmp/maker_8RuX8Z/scf718 > 0000006915_pilon_pilon.abinit_masked.0 > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske > d.0.genome%2Ehmm.snap > #-------------------------------# > setting up GFF3 output and fasta chunks > processing all repeats > doing repeat masking > in cluster::shadow_cluster... > ...finished clustering. > error: unknown command "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap help' > ERROR: Snap failed > --> rank=21, hostname=localhost > ERROR: Failed while preparing ab-inits > ERROR: Chunk failed at level:0, tier_type:2 > FAILED CONTIG:scf7180000007575_pilon_pilon > > I confirmed that the path to genome.hmm is correct.
In addition, run.log contains the following kind of output: > > STARTED genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m > asked.0.genome%2Ehmm.snap > DIED RANK 30:4:0:0 > DIED COUNT 2 > DIED RANK 30 > DIED COUNT 2 > > How can I resolve this issue? > > Also, is the warning about it being impossible to use the old data to be expected? > > Attached files: > - maker_otps1.ctl: first pass control file > - maker_opts2.ctl: second pass control file > - run.log: log file for an example contig > > Thanks in Advance, > > Paul Sheridan > > -- > CSO at Tupac Bio > Email: paul at tupac.bio > Homepage: www.paulsheridan.net > Mobile: +81 80 7889 0859 > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul at tupac.bio Sun Apr 7 03:27:53 2019 From: paul at tupac.bio (Paul Sheridan) Date: Sun, 7 Apr 2019 18:27:53 +0900 Subject: [maker-devel] Running SNAP with MAKER In-Reply-To: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com> References: <1B661B06-3142-40D8-AEF5-A387397AA91E@gmail.com> Message-ID: Hi Carson, Indeed, I did generate the hmm file immediately previously to my second run. I redid the process by following these commands from the link you supplied: mkdir snap cd snap gff3_merge -d /root/tuna-round-2/genome.maker.output/genome_master_datastore_index.log maker2zff genome.all.gff fathom -categorize 1000 genome.ann genome.dna fathom -export 1000 -plus uni.ann uni.dna forge export.ann export.dna hmm-assembler.pl genome . > ../genome1.hmm I didn't find any errors generated by Snap during training. 
But when I reran MAKER, I got errors of this variety: processing all repeats processing all repeats error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help' error: unknown command "/root/tuna-round-2/genome.maker.output/genome1.hmm", see 'snap help' preparing masked sequence processing all repeats collecting blastx repeatmasking preparing masked sequence collecting blastx repeatmasking collecting blastx repeatmasking processing all repeats processing all repeats processing all repeats processing all repeats preparing masked sequence collecting blastx repeatmasking ERROR: Snap failed --> rank=21, hostname=localhost ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2 FAILED CONTIG:scf7180000008536_pilon_pilon ERROR: Snap failed --> rank=5, hostname=localhost ERROR: Failed while preparing ab-inits ERROR: Chunk failed at level:0, tier_type:2 FAILED CONTIG:scf7180000008522_pilon_pilon ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scf7180000008536_pilon_pilon preparing masked sequence ERROR: Chunk failed at level:4, tier_type:0 FAILED CONTIG:scf7180000008522_pilon_pilon Do you have any other suggestions? Thanks in Advance, Paul On Sun, Apr 7, 2019 at 6:00 AM Carson Holt wrote: > The error is being thrown by snap itself. Perhaps there is an issue with > the genome.hmm file. Did you generate the file immediately previously to > this run? Perhaps you can redo that process, and review any errors that > come up during training. > > Some details on training SNAP from the wiki ?> > http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors > > ?Carson > > > On Apr 4, 2019, at 4:46 PM, Paul Sheridan wrote: > > Dear MAKER Team, > > I am running MAKER 2.31.10 a 32 core instance. My first pass completed > successfully. 
However, my second pass using SNAP and Augustus trained ab > initio gene predictions failed. Here is some example output which > illustrates the problem: > > MAKER WARNING: Changes in control files make re-use of all old data > impossible > All old files will be erased before continuing > processing all repeats > doing repeat masking > doing repeat masking > #--------------------------------------------------------------------- > Now starting the contig!! > SeqID: scf7180000008677_pilon_pilon > Length: 49996 > #--------------------------------------------------------------------- > > doing repeat masking > preparing ab-inits > running snap. > #--------- command -------------# > Widget::snap: > /usr/bin/snap > /root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm > /tmp/maker_8RuX8Z/scf718 > 0000006915_pilon_pilon.abinit_masked.0 > > /tmp/maker_8RuX8Z/scf7180000006915_pilon_pilon.abinit_maske > d.0.genome%2Ehmm.snap > #-------------------------------# > setting up GFF3 output and fasta chunks > processing all repeats > doing repeat masking > in cluster::shadow_cluster... > ...finished clustering. > error: unknown command > "/root/tuna-round-2/genome.maker.output/snap/round1/genome.hmm", see 'snap > help' > ERROR: Snap failed > --> rank=21, hostname=localhost > ERROR: Failed while preparing ab-inits > ERROR: Chunk failed at level:0, tier_type:2 > FAILED CONTIG:scf7180000007575_pilon_pilon > > I confirmed that the path to genome.hmm is correct. In addition, run.log > contains the following kind of output: > > STARTED > genome.maker.output/genome_datastore/00/6E/scf7180000008677_pilon_pilon//theVoid.scf7180000008677_pilon_pilon/scf7180000008677_pilon_pilon.abinit_m > asked.0.genome%2Ehmm.snap > DIED RANK 30:4:0:0 > DIED COUNT 2 > DIED RANK 30 > DIED COUNT 2 > > How can I resolve this issue? > > Also, is the warning about it being impossible to use the old data to be > expected? 
> > Attached files: > - maker_otps1.ctl: first pass control file > - maker_opts2.ctl: second pass control file > - run.log: log file for an example contig > > Thanks in Advance, > > Paul Sheridan > > -- > CSO at Tupac Bio > Email: paul at tupac.bio > Homepage: www.paulsheridan.net > Mobile: +81 80 7889 0859 > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > -- CSO at Tupac Bio Email: paul at tupac.bio Homepage: www.paulsheridan.net Mobile: +81 80 7889 0859 -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglic at mail.tau.ac.il Sun Apr 7 07:25:22 2019 From: liorglic at mail.tau.ac.il (Lior Glick) Date: Sun, 7 Apr 2019 16:25:22 +0300 Subject: [maker-devel] Curious pattern in AED distributions Message-ID: Hi MAKER users, Lately I've been performing annotations for multiple genomes from the same species. When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: [image: AED_hist.png] This pattern is a bit surprising to me, in two aspects: 1) Why is there a surge towards 0.5? 2) Why is there a sudden drop right after that surge? Has anyone else seen this, or is this a specific outcome of my data/configuration? Any ideas of what may cause such a distribution? While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. Would appreciate your comments. Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: AED_hist.png Type: image/png Size: 8232 bytes Desc: not available URL: From myandell at genetics.utah.edu Sun Apr 7 09:11:36 2019 From: myandell at genetics.utah.edu (Mark Yandell) Date: Sun, 7 Apr 2019 15:11:36 +0000 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: Message-ID: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu> Hi Lior, Fun! The short answer is I don't know. Obviously, the good stuff is on the right side of 0.5. That said, I can think of a couple of things to look into to explain the left side of the graph. Are you allowing single exon genes? Are you using RNA seq data, protein, or both? What about repeat masking? Are you doing it? Do you have your own library? My first guess, would be low complexity/repeat sequences generating more or less random blastx hits across the genome. Carson, what do you think? And finally, what does the AED look like for the genes included in the final build? Sorry for all the questions, Lior. That's your punishment for asking an interesting one. --mark From: maker-devel on behalf of Lior Glick Date: Sunday, April 7, 2019 at 7:26 AM To: "maker-devel at yandell-lab.org" Subject: [maker-devel] Curious pattern in AED distributions Hi MAKER users, Lately I've been performing annotations for multiple genomes from the same species. When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: [AED_hist.png] This pattern is a bit surprising to me, in two aspects: 1) Why is there a surge towards 0.5? 2) Why is there a sudden drop right after that surge? Has anyone else seen this, or is this a specific outcome of my data/configuration? Any ideas of what may cause such a distribution? While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. Would appreciate your comments. Thank you!
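[Editor's sketch] For readers who want to reproduce the kind of histogram Lior describes, here is a minimal Python sketch. It assumes the annotations are in a MAKER GFF3 file whose mRNA features carry an `_AED=` attribute in column 9; the function names and the 0.05 bin width are illustrative choices, not part of MAKER itself.

```python
import re
from collections import Counter

# Matches "_AED=<number>" but not "_eAED=<number>" in a GFF3 attribute column.
AED_RE = re.compile(r"(?:^|;)_AED=([0-9.]+)")

def aed_values(gff_lines):
    """Yield the _AED value of every mRNA feature found in GFF3 text lines."""
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "mRNA":
            continue
        match = AED_RE.search(cols[8])
        if match:
            yield float(match.group(1))

def aed_histogram(values, width=0.05):
    """Count values per bin; bin i covers [i*width, (i+1)*width)."""
    n_bins = round(1 / width)
    counts = Counter()
    for value in values:
        # Small epsilon guards against float rounding at bin edges;
        # AED == 1.0 is folded into the last bin.
        index = min(int(value / width + 1e-9), n_bins - 1)
        counts[index] += 1
    return counts
```

Calling `aed_histogram(aed_values(open("genome.all.gff")))` (file name hypothetical) gives per-bin counts that can be printed or passed to any plotting tool, which makes it easy to see whether a spike sits just below or just above 0.5.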
-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 8233 bytes Desc: image001.png URL: From myandell at genetics.utah.edu Sun Apr 7 11:39:16 2019 From: myandell at genetics.utah.edu (Mark Yandell) Date: Sun, 7 Apr 2019 17:39:16 +0000 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu> Message-ID: <116090CF-13B6-4E54-A5AA-8F7D7FCF2F23@umail.utah.edu> Sorry. I'm dyslexic, especially early in the morning. Yes, good stuff is on the left. As regards single exon genes, that's always a hard call, as these have a higher false positive rate. One thing to consider is how prevalent introns are in your organism. Carson can give more advice on this point, I'm sure. By "final build", I meant: is this using the "Standard build" or "Max Build" protocol from PMC4286374? From: Lior Glick Date: Sunday, April 7, 2019 at 10:29 AM To: Mark Yandell Cc: "liorglic at mail.tau.ac.il" , "maker-devel at yandell-lab.org" Subject: Re: [maker-devel] Curious pattern in AED distributions Dear Mark, Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (; Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation... Now for the questions: 1) I did not make any filtrations so far, so single exon genes are included as well. in fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes? 2) My evidence consist of assembled transcripts, proteins and predicted gene models (pred_gff).
3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest. Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"? Hope these answers are helpful, and waiting to hear more thoughts. Thanks again. On Sun, Apr 7, 2019, 18:11 Mark Yandell > wrote: Hi Lior, Fun! The short answer is I don?t know. Obviously, the good stuff is on the right side of 0.5. That said, I can think of a couple of things to look into to explain the left side of the graph. Are you allowing single exon genes? Are you using RNA seq data, protein, or both? What about repeat masking? Are you doing it? Do you have your own library? My first guess, would be low complexity/repeat sequences generating more or less random blastx hits across the genome?Carson, what do you think? And finally, what does the AED look like for the genes included in the final build? Sorry for all the questions, Lior. That?s your punishment for asking an interesting one. ? --mark From: maker-devel > on behalf of Lior Glick > Date: Sunday, April 7, 2019 at 7:26 AM To: "maker-devel at yandell-lab.org" > Subject: [maker-devel] Curious pattern in AED distributions Hi MAKER users, Lately I've been performing annotations for multiple genomes from the same species. When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: [AED_hist.png] This pattern is a bit surprising to me, in two aspects: 1) Why is there a surge towards 0.5? 2) Why is there a sudden drop right after that surge? Has anyone else seen this, or is this a specific outcome of my data/configuration? Any ideas of what may cause such a distribution? While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. Would appreciate your comments. Thank you! 
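[Editor's sketch] The single-exon question raised in this exchange can be explored before deciding on a filter. The Python sketch below is only an illustration: it assumes a GFF3 file in which `exon` features point to their transcript through a `Parent=` attribute, and the function names are hypothetical rather than anything shipped with MAKER.

```python
from collections import Counter

def exon_counts(gff_lines):
    """Return a Counter mapping transcript ID -> number of exon features."""
    counts = Counter()
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break  # sequence section; no more features
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # An exon shared by several transcripts lists them comma-separated.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                counts[parent] += 1
    return counts

def single_exon_transcripts(gff_lines):
    """Return the sorted IDs of transcripts that have exactly one exon."""
    return sorted(t for t, n in exon_counts(gff_lines).items() if n == 1)
```

Comparing the AED distribution of the transcripts returned by `single_exon_transcripts` with that of the multi-exon ones is one way to check whether single-exon models account for the surge near 0.5, without discarding anything up front.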
-------------- next part -------------- An HTML attachment was scrubbed... URL: From ychliu at genetics.ac.cn Tue Apr 2 19:21:33 2019 From: ychliu at genetics.ac.cn (ychliu at genetics.ac.cn) Date: Wed, 3 Apr 2019 09:21:33 +0800 Subject: [maker-devel] MAKER problem with gff3 file Message-ID: <2019040309213197334742@genetics.ac.cn> Dear MAKER developers, I recently use the MAKER to do gene annotation. But even I use the gff3 file as the EST evidence, the result shows no gene that marked by est2genome (I do use the parameter est2genome=1). It may means that the gff3 seems doesn't work. So what's the problem? How can I solve it? Eager for you assistance. Faithfully yours. Yucheng Liu Yucheng Liu Institute of Genetics and Developmental Biology, CAS Beijing, 100101 China Tel: 86-010-64801362 E-mail: ychliu at genetics.ac.cn -------------- next part -------------- An HTML attachment was scrubbed... URL: From liorglck at gmail.com Sun Apr 7 10:29:13 2019 From: liorglck at gmail.com (Lior Glick) Date: Sun, 7 Apr 2019 19:29:13 +0300 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu> References: <805319DB-37C6-4802-A5A0-F74BFBD7BAA1@umail.utah.edu> Message-ID: Dear Mark, Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (; Before I answer them, I just want to make sure we're on the same page - as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation... Now for the questions: 1) I did not make any filtrations so far, so single exon genes are included as well. in fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single exon genes? 
2) My evidence consist of assembled transcripts, proteins and predicted gene models (pred_gff). 3) As for repeats, I'm masking based on a repeats library obtained from a previous publication, specific to my organism of interest. Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"? Hope these answers are helpful, and waiting to hear more thoughts. Thanks again. On Sun, Apr 7, 2019, 18:11 Mark Yandell wrote: > Hi Lior, > > > > > > Fun! The short answer is I don?t know. Obviously, the good stuff is on the > right side of 0.5. > > That said, I can think of a couple of things to look into to explain the > left side of the graph. Are you allowing single exon genes? Are you using > RNA seq data, protein, or both? What about repeat masking? Are you doing > it? Do you have your own library? > > > > My first guess, would be low complexity/repeat sequences generating more > or less random blastx hits across the genome?Carson, what do you think? > > > > And finally, what does the AED look like for the genes included in the > final build? > > > > > > Sorry for all the questions, Lior. That?s your punishment for asking an > interesting one. ? > > > > --mark > > > > > > *From: *maker-devel on behalf of > Lior Glick > *Date: *Sunday, April 7, 2019 at 7:26 AM > *To: *"maker-devel at yandell-lab.org" > *Subject: *[maker-devel] Curious pattern in AED distributions > > > > Hi MAKER users, > > Lately I've been performing annotations for multiple genomes from the same > species. > > When plotting the histogram of AED scores over all genes, I repeatedly see > a very specific pattern, that looks something like this: > > [image: AED_hist.png] > > This pattern is a bit surprising to me, in two aspects: > > 1) Why is there a surge towards 0.5? > > 2) Why is there a sudden drop right after that surge? > > > > Has anyone else seen this, or is this a specific outcome of my > data/configuration? 
> > Any ideas of what may cause such a distribution? > > > > While this is not necessarily an indication of a problem or bug, it does > seem a bit odd, and might imply some bias or artifact. > > Would appreciate your comments. > > Thank you! > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 8233 bytes Desc: not available URL: From carsonhh at gmail.com Sun Apr 7 19:06:49 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 7 Apr 2019 19:06:49 -0600 Subject: [maker-devel] MAKER problem with gff3 file In-Reply-To: <2019040309213197334742@genetics.ac.cn> References: <2019040309213197334742@genetics.ac.cn> Message-ID: <961D15D1-36C0-4DD9-BE81-7C652A2C4CCF@gmail.com> The est2genome=1 option in MAKER2 only works with input fasta files because it's based on Exonerate's est2genome alignments. It does not work with GFF3 input (GFF3 is missing some things that are in the Exonerate report). MAKER3, however, will let you do this with GFF3 input (it goes back and tries to predict the missing info that Exonerate would have produced). --Carson > On Apr 2, 2019, at 7:21 PM, ychliu at genetics.ac.cn wrote: > > Dear MAKER developers, > I recently use the MAKER to do gene annotation. But even I use the gff3 file as the EST evidence, the result shows no gene that marked by est2genome (I do use the parameter est2genome=1). It may means that the gff3 seems doesn't work. So what's the problem? How can I solve it? Eager for you assistance. > Faithfully yours.
> Yucheng Liu > > Yucheng Liu > Institute of Genetics and Developmental Biology, CAS > Beijing, 100101 China > Tel: 86-010-64801362 > E-mail: ychliu at genetics.ac.cn _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Sun Apr 7 19:08:54 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 7 Apr 2019 19:08:54 -0600 Subject: [maker-devel] Installation Failure with pg_config? In-Reply-To: References: Message-ID: <89174279-57D0-46D3-BE9D-FA03ED861227@gmail.com> DBD::Pg is optional. You should be able to say "No" to the question on whether you want to install optional modules during the Build step. --Carson > On Mar 14, 2019, at 4:24 PM, Shaowen Jiang wrote: > > Dear MAKER2 admins: > > Hi, I have read some tutorials for annotating a newly assembly genome and MAKER2 seems to be a very good and functional pipeline to me. So I am trying to use it to annotate a new assembly mammalian genome that our lab just generated. > But I was stuck while I was trying to install MAKER2 to our slurm HPC server. > I think the pipeline is trying to install several perl packages locally, but one of them called DBD::Pg requires the path of pg_config? > screenshot as below > > But I think our server doesn't have this path and I don't have root to install some other stuff, like libpq-dev or PostgreSQL. > Is that any other methods that can circle around that? > > Any help or advice would be appreciated! > > Thanks, > Shaowen > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed...
URL: From carsonhh at gmail.com Sun Apr 7 19:32:39 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 7 Apr 2019 19:32:39 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: Message-ID: That's interesting. It could be caused by a handful of internal filters that help with spurious results. For example, I use a 0.5 sensitivity/specificity threshold to identify shared edges for a Jaccardian split on overlapping evidence clusters. There are also a couple of places where, if the only thing supporting a model is a single-exon blastx hit (i.e. no exonerate, ab initio model, or EST splice support, just a chunk of single-exon blastx), MAKER will use a reading-frame-aware AED value of 0.5 as a filter (that is, it checks whether the reading frame matches and not just raw overlap). If that's the case, the spike near 0.5 may indicate that I needed to be a little stricter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be a better cutoff for these spurious blastx-induced models. --Carson > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote: > > Hi MAKER users, > Lately I've been performing annotations for multiple genomes from the same species. > When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this: > > This pattern is a bit surprising to me, in two aspects: > 1) Why is there a surge towards 0.5? > 2) Why is there a sudden drop right after that surge? > > Has anyone else seen this, or is this a specific outcome of my data/configuration? > Any ideas of what may cause such a distribution? > > While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact. > Would appreciate your comments. > Thank you!
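[Editor's sketch] To make the 0.5 value in Carson's reply concrete, here is a small Python illustration of AED as it is usually defined: AED = 1 - (SN + SP) / 2, where SN is the fraction of evidence nucleotides covered by the annotation and SP is the fraction of annotation nucleotides covered by evidence. This is a plain nucleotide-overlap version written for illustration; it is not MAKER's internal code and it does not implement the reading-frame-aware variant Carson mentions. The interval representation and function names are assumptions.

```python
def covered(intervals):
    """Return the set of 1-based positions covered by inclusive (start, end) pairs."""
    return {pos for start, end in intervals for pos in range(start, end + 1)}

def aed(annotation, evidence):
    """Annotation edit distance from nucleotide overlap: 1 - (SN + SP) / 2."""
    ann, evi = covered(annotation), covered(evidence)
    if not ann or not evi:
        return 1.0  # no annotation or no evidence: treat as no support at all
    shared = len(ann & evi)
    sensitivity = shared / len(evi)  # share of the evidence that is annotated
    specificity = shared / len(ann)  # share of the annotation that has evidence
    return 1.0 - (sensitivity + specificity) / 2
```

For instance, a 100 bp model whose evidence overlaps only half of it, with the evidence extending an equal distance beyond the model, as in `aed([(1, 100)], [(51, 150)])`, lands exactly on 0.5. Models that are barely half-supported would therefore cluster just under a 0.5 cutoff, which is consistent with the surge-then-drop shape discussed in this thread.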
> _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org From xvazquezc at gmail.com Sun Apr 7 22:42:15 2019 From: xvazquezc at gmail.com (Xabier Vázquez-Campos) Date: Mon, 8 Apr 2019 14:42:15 +1000 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: Message-ID: If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. e.g. by default it will require having some EST evidence On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote: > That's interesting. It could be a handful of internal filters that help > with spurious results. > > I use a 0.5 sensitivity/specificity to identify shared edges for a > jaccardian split on overlapping evidence clusters for example. There are > also a couple of places where if the only thing supporting a model is a > single exon blastx hit (i.e. no exonerate, ab initio model, or est splice > support, but just a chunk od single exon blastx) then maker will use a > reading frame aware AED value of 0.5 as a filter (as in it checks if the > reading frame matches and not just raw overlap). If that's the case, the > spike near 0.5 may indicate I needed to be a little strickter than my > empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cuttoff > for these spurious blastx induced models. > > --Carson > > > > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote: > > > > Hi MAKER users, > > Lately I've been performing annotations for multiple genomes from the > same species. > > When plotting the histogram of AED scores over all genes, I repeatedly > see a very specific pattern, that looks something like this: > > > > This pattern is a bit surprising to me, in two aspects: > > 1) Why is there a surge towards 0.5? > > 2) Why is there a sudden drop right after that surge?
> > > > Has anyone else seen this, or is this a specific outcome of my > data/configuration? > > Any ideas of what may cause such a distribution? > > > > While this is not necessarily an indication of a problem or bug, it does > seem a bit odd, and might imply some bias or artifact. > > Would appreciate your comments. > > Thank you! > > _______________________________________________ > > maker-devel mailing list > > maker-devel at box290.bluehost.com > > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > > > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org > -- Xabier Vázquez-Campos, *PhD* *Research Associate* NSW Systems Biology Initiative School of Biotechnology and Biomolecular Sciences The University of New South Wales Sydney NSW 2052 AUSTRALIA -------------- next part -------------- An HTML attachment was scrubbed... URL: From carsonhh at gmail.com Sun Apr 7 23:20:24 2019 From: carsonhh at gmail.com (Carson Holt) Date: Sun, 7 Apr 2019 23:20:24 -0600 Subject: [maker-devel] Curious pattern in AED distributions In-Reply-To: References: Message-ID: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com> Yes. maker2zff tries to further select a subset of the best supported models by requiring multiple forms of evidence support. --Carson > On Apr 7, 2019, at 10:42 PM, Xabier Vázquez-Campos wrote: > > If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. e.g. by default it will require having some EST evidence > > On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote: > That's interesting. It could be a handful of internal filters that help with spurious results. > > I use a 0.5 sensitivity/specificity to identify shared edges for a jaccardian split on overlapping evidence clusters for example.
> There are also a couple of places where if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or est splice support, but just a chunk of single exon blastx) then maker will use a reading frame aware AED value of 0.5 as a filter (as in it checks if the reading frame matches and not just raw overlap). If that's the case, the spike near 0.5 may indicate I needed to be a little stricter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff for these spurious blastx induced models.
>
> --Carson
>
> > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote:
> >
> > Hi MAKER users,
> > Lately I've been performing annotations for multiple genomes from the same species.
> > When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:
> >
> > This pattern is a bit surprising to me, in two aspects:
> > 1) Why is there a surge towards 0.5?
> > 2) Why is there a sudden drop right after that surge?
> >
> > Has anyone else seen this, or is this a specific outcome of my data/configuration?
> > Any ideas of what may cause such a distribution?
> >
> > While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
> > Would appreciate your comments.
> > Thank you!
> > _______________________________________________
> > maker-devel mailing list
> > maker-devel at box290.bluehost.com
> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
> >
> _______________________________________________
> maker-devel mailing list
> maker-devel at box290.bluehost.com
> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>
> --
> Xabier Vázquez-Campos, PhD
> Research Associate
> NSW Systems Biology Initiative
> School of Biotechnology and Biomolecular Sciences
> The University of New South Wales
> Sydney NSW 2052 AUSTRALIA
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From liorglic at mail.tau.ac.il Mon Apr 8 00:54:06 2019
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Mon, 8 Apr 2019 09:54:06 +0300
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
Message-ID:

Hello again and thank you all for your interesting answers.
I mistakenly answered Mark yesterday from an unsubscribed address, which resulted in only him getting it, so for documentation's sake I'm posting my answer here again, along with Mark's reply:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dear Mark,
Thank you for the quick reply. I'm happy to see this ignites your interest and am willing to endure your punishing questions (;
Before I answer them, I just want to make sure we're on the same page: as far as I understand, lower AED scores indicate higher agreement with the evidence, so the "good stuff" is actually left of the 0.5 surge. Am I correct? Otherwise, this is a very poor annotation...
Now for the questions:
1) I have not applied any filtering so far, so single-exon genes are included as well. In fact, I'm exploring the results in order to develop some criteria for filtering the genes. Would you suggest discarding single-exon genes?
2) My evidence consists of assembled transcripts, proteins and predicted gene models (pred_gff).
3) As for repeats, I'm masking based on a repeat library obtained from a previous publication, specific to my organism of interest.
Unfortunately, I didn't understand your final question. Could you please explain what you mean by "final build"?
Hope these answers are helpful, and waiting to hear more thoughts.
Thanks again.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
*To which Mark replied:*
Sorry. I'm dyslexic, especially early in the morning. Yes, good stuff is on the left.
As regards single-exon genes, that's always a hard call, as these have a higher false positive rate. One thing to consider is how prevalent introns are in your organism. Carson can give more advice on this point, I'm sure.
By "final build", I meant: is this using the "Standard build" or "Max build" protocol from PMC4286374?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mark - well, as I said, I haven't done any filtering yet, so I guess my annotation currently includes genes that would be discarded even with the "max build". I'll give this a try and look at the resulting distribution.
Xabier - thanks, but I'm not using SNAP (just Augustus).
Carson - I see a few fingers pointing in the direction of single-exon models, so maybe I should see what happens to the distribution of AED when these genes are removed.

I'll get back to you with some more results.

On Mon, 8 Apr 2019 at 8:20, Carson Holt wrote:

> Yes. maker2zff tries to further select a subset of the best supported models by requiring multiple forms of evidence support.
>
> --Carson
>
> On Apr 7, 2019, at 10:42 PM, Xabier Vázquez-Campos wrote:
>
> If you train SNAP, the maker2zff script has internal quality cutoffs based on the existence of evidence. e.g.
> by default it will require having some EST evidence
>
> On Mon, 8 Apr 2019 at 11:32, Carson Holt wrote:
>
>> That's interesting. It could be a handful of internal filters that help with spurious results.
>>
>> I use a 0.5 sensitivity/specificity to identify shared edges for a jaccardian split on overlapping evidence clusters for example. There are also a couple of places where if the only thing supporting a model is a single exon blastx hit (i.e. no exonerate, ab initio model, or est splice support, but just a chunk of single exon blastx) then maker will use a reading frame aware AED value of 0.5 as a filter (as in it checks if the reading frame matches and not just raw overlap). If that's the case, the spike near 0.5 may indicate I needed to be a little stricter than my empirical cutoff estimate. Perhaps 0.4 or 0.45 would be the better cutoff for these spurious blastx induced models.
>>
>> --Carson
>>
>> > On Apr 7, 2019, at 7:25 AM, Lior Glick wrote:
>> >
>> > Hi MAKER users,
>> > Lately I've been performing annotations for multiple genomes from the same species.
>> > When plotting the histogram of AED scores over all genes, I repeatedly see a very specific pattern, that looks something like this:
>> >
>> > This pattern is a bit surprising to me, in two aspects:
>> > 1) Why is there a surge towards 0.5?
>> > 2) Why is there a sudden drop right after that surge?
>> >
>> > Has anyone else seen this, or is this a specific outcome of my data/configuration?
>> > Any ideas of what may cause such a distribution?
>> >
>> > While this is not necessarily an indication of a problem or bug, it does seem a bit odd, and might imply some bias or artifact.
>> > Would appreciate your comments.
>> > Thank you!
>> > _______________________________________________
>> > maker-devel mailing list
>> > maker-devel at box290.bluehost.com
>> > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>> _______________________________________________
>> maker-devel mailing list
>> maker-devel at box290.bluehost.com
>> http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org
>>
>
> --
> Xabier Vázquez-Campos, *PhD*
> *Research Associate*
> NSW Systems Biology Initiative
> School of Biotechnology and Biomolecular Sciences
> The University of New South Wales
> Sydney NSW 2052 AUSTRALIA
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From liorglic at mail.tau.ac.il Mon Apr 8 03:10:15 2019
From: liorglic at mail.tau.ac.il (Lior Glick)
Date: Mon, 8 Apr 2019 12:10:15 +0300
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
Message-ID:

Hi again - quick update:
I made a plot comparing the histograms of single-exon genes to multi-exon genes:
[image: newplot (5).png]
It definitely looks like single-exon genes are *enriched* for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software.
Any other thoughts/suggestions?

Thanks again,
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: newplot (5).png
Type: image/png
Size: 18037 bytes
Desc: not available
URL:
From carsonhh at gmail.com Mon Apr 8 10:48:42 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 8 Apr 2019 10:48:42 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
Message-ID:

One note.
When I say single-exon blastx hit, I mean that the evidence is single exon, not that the gene model is single exon. What I think you are seeing is an effect that seems to be partially related to under-masking, i.e. a spurious partial blastx alignment to a low-complexity repeat (which is why the blastx protein alignment refuses to polish with exonerate). That is why the filter was added.

So if a model (single- or multi-exon) has no additional ab initio prediction support, no EST support, and no exonerate-polished protein support, but does have a single-exon/single-HSP blastx overlap, it gets filtered out at 0.5. That threshold is based on trial and error on a couple of genomes where we saw this occur, but your graph suggests that the filter might be too loose and that 0.4 or 0.45 might be a better value.

So the spike is caused by poor blastx and under-masking (this may be explained if you are supplying pred_gff models that were generated on an unmasked assembly outside of MAKER), and the drop around 0.5 is caused by MAKER filtering out models supported only by what appear to be spurious blastx alignments.

--Carson

> On Apr 8, 2019, at 3:10 AM, Lior Glick wrote:
>
> Hi again - quick update:
> I made a plot comparing the histograms of single-exon genes to multi-exon genes:
>
> It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software.
> Any other thoughts/suggestions?
>
> Thanks again,
>
-------------- next part --------------
An HTML attachment was scrubbed...
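For anyone who wants to repeat the comparison discussed in this thread (AED histograms split by exon count), here is a minimal sketch. It assumes a tab-delimited MAKER GFF3 in which each mRNA row carries an `_AED=` attribute and each exon row carries a `Parent=` attribute. The function names `aed_by_exon_class` and `bin_counts`, the three classes, and the sample rows are made up for illustration and are not part of MAKER.

```python
from collections import Counter, defaultdict


def parse_attrs(field):
    """Turn a GFF3 attribute column (key=value;key=value) into a dict."""
    return dict(kv.split("=", 1) for kv in field.strip().split(";") if "=" in kv)


def aed_by_exon_class(gff_lines):
    """Group mRNA AED scores by exon count: 'single', 'two' or 'multi'."""
    aed = {}           # mRNA ID -> AED score
    exons = Counter()  # mRNA ID -> number of exon rows
    for line in gff_lines:
        if line.startswith("##FASTA"):
            break      # sequence section follows; no more features
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9:
            continue
        attrs = parse_attrs(cols[8])
        if cols[2] == "mRNA" and "_AED" in attrs:
            aed[attrs["ID"]] = float(attrs["_AED"])
        elif cols[2] == "exon":
            # An exon row may list several parents, separated by commas.
            for parent in filter(None, attrs.get("Parent", "").split(",")):
                exons[parent] += 1
    groups = defaultdict(list)
    for mrna, score in aed.items():
        n = exons[mrna]
        if n == 0:
            continue   # no exon rows seen for this transcript
        groups["single" if n == 1 else "two" if n == 2 else "multi"].append(score)
    return groups


def bin_counts(scores, width_pct=5):
    """Count two-decimal scores per bin of width_pct/100; 1.0 joins the last bin."""
    last = 100 // width_pct - 1
    return Counter(min(round(s * 100) // width_pct, last) for s in scores)


def _row(ftype, attrs):
    # Hypothetical helper that builds one tab-delimited sample row.
    return "\t".join(["ctg1", "maker", ftype, "1", "100", ".", "+", ".", attrs])


# Made-up sample: m1 has three exons, m2 has two, m3 has one.
SAMPLE = (
    [_row("mRNA", "ID=m1;Parent=g1;_AED=0.47")] + [_row("exon", "ID=e1;Parent=m1")] * 3
    + [_row("mRNA", "ID=m2;Parent=g2;_AED=0.63")] + [_row("exon", "ID=e2;Parent=m2")] * 2
    + [_row("mRNA", "ID=m3;Parent=g3;_AED=0.50")] + [_row("exon", "ID=e3;Parent=m3")]
)

for cls, scores in sorted(aed_by_exon_class(SAMPLE).items()):
    print(cls, dict(bin_counts(scores)))
```

On a real annotation, pass an open file handle instead of `SAMPLE` and feed the per-class counts to whatever plotting tool you prefer. The bins are computed on integer hundredths to avoid floating-point edge effects right at the 0.5 boundary.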
URL:
From carsonhh at gmail.com Mon Apr 8 10:51:55 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 8 Apr 2019 10:51:55 -0600
Subject: [maker-devel] Curious pattern in AED distributions
In-Reply-To:
References: <480A2430-C312-4A43-B659-4694B4F8E61A@gmail.com>
Message-ID: <75B7E2C9-2D1B-452F-BEED-704289C881ED@gmail.com>

Try also adding two-exon models to the graph. It would be interesting to see whether these are attempted single-exon models where the predictor added a micro-intron to keep the open reading frame going against a single-exon blastx hint.

--Carson

> On Apr 8, 2019, at 3:10 AM, Lior Glick wrote:
>
> Hi again - quick update:
> I made a plot comparing the histograms of single-exon genes to multi-exon genes:
>
> It definitely looks like single-exon genes are enriched for the 0.5 score, but it does not account for the entire surge, as there also seem to be lots of multi-exon genes involved. This may suggest that the 0.5 peak is a result of multiple effects buried within the software.
> Any other thoughts/suggestions?
>
> Thanks again,
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From ying.hu at ufl.edu Wed Apr 17 09:20:00 2019
From: ying.hu at ufl.edu (Hu,Ying)
Date: Wed, 17 Apr 2019 15:20:00 +0000
Subject: [maker-devel] maker exons number
Message-ID:

Hi, Carson,

I am using MAKER 2.31.6 to annotate a genome. I noticed that the exon numbers in each gene do not start from 1. Can you give me some suggestions on how to change the exon numbers to 1, 2, 3, ... in each gene? Thanks,

Ying

Here are some examples:

tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow
tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0
tig00000226|arrow maker mRNA 26339 27915 . + .
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82 tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27490 27636 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27738 27808 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 tig00000226|arrow maker mRNA 5803 6589 . + . 
ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 5803 5975 . + 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00000226|arrow maker CDS 6505 6589 . + 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker five_prime_UTR 40927 41192 . + . 
ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 tig00034405|arrow maker gene 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 tig00034405|arrow maker mRNA 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 59245 59725 . 
+ 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 tig00034405|arrow maker exon 90355 90911 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91011 91086 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93711 93786 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 94536 94573 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93866 93972 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91179 91240 . 
- 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 -------------- next part -------------- An HTML attachment was scrubbed...
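In the sample rows above, the numbers at the end of the exon IDs (exon:58 to exon:63, exon:7157 to exon:7173) look like running counters across the file rather than positions within a gene; for the minus-strand gene they even decrease as the coordinate increases. A per-transcript exon number can instead be derived from the coordinates and strand. The following is a minimal sketch, assuming tab-delimited GFF3 input (the rows in this archive have had their tabs flattened to spaces). The function name `exon_ranks` is hypothetical, and the demo rows are a four-exon subset of the listing above.

```python
from collections import defaultdict


def exon_ranks(gff_lines):
    """Return {transcript ID: [(rank, start, end), ...]} in 5'-to-3' order."""
    by_parent = defaultdict(list)
    for line in gff_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != "exon":
            continue
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        # One exon row may belong to several transcripts (comma-separated Parent).
        for parent in filter(None, attrs.get("Parent", "").split(",")):
            by_parent[parent].append((start, end, strand))
    ranked = {}
    for parent, exons in by_parent.items():
        # Minus-strand transcripts are read from the highest coordinate down.
        ordered = sorted(exons, reverse=(exons[0][2] == "-"))
        ranked[parent] = [(i, s, e) for i, (s, e, _) in enumerate(ordered, 1)]
    return ranked


# Transcript IDs and coordinates copied from the listing above (subset only).
PLUS = "augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1"
MINUS = "maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1"
ROWS = [
    f"tig00000226|arrow\tmaker\texon\t27490\t27636\t.\t+\t.\tID={PLUS}:exon:59;Parent={PLUS}",
    f"tig00000226|arrow\tmaker\texon\t26339\t26353\t.\t+\t.\tID={PLUS}:exon:58;Parent={PLUS}",
    f"tig00034405|arrow\tmaker\texon\t94536\t94573\t.\t-\t.\tID={MINUS}:exon:7164;Parent={MINUS}",
    f"tig00034405|arrow\tmaker\texon\t94645\t94796\t.\t-\t.\tID={MINUS}:exon:7163;Parent={MINUS}",
]

for transcript, exons in exon_ranks(ROWS).items():
    print(transcript, exons)
```

The rank is computed per parent transcript rather than stored in the exon ID, so an exon row shared by several transcripts can have a different number in each of them.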
URL:

From carsonhh at gmail.com Wed Apr 17 13:43:41 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Wed, 17 Apr 2019 13:43:41 -0600
Subject: [maker-devel] maker exon number
In-Reply-To:
References:
Message-ID:

The ID= value is simply a unique value used to resolve inheritance in conjunction with Parent=. It has no biological meaning. Also, to reduce redundancy in GFF3 format, a single 'exon' feature can be the child of multiple mRNA features, so a single 'exon' line can be the first exon in one transcript but the second exon in another.

--Carson

> On Apr 17, 2019, at 9:23 AM, YING HU wrote:
>
> Hi, Carson,
>
> I am using MAKER 2.31.6 to annotate a genome. I noticed that the exon numbers in each gene do not start from 1. Can you give me some suggestions on how to change the exon numbers to 1, 2, 3, ... in each gene? Thanks,
>
> Ying
>
> Here are some examples:
>
> tig00000226|arrow . contig 1 43850 . . . ID=tig00000226|arrow;Name=tig00000226|arrow
> tig00000226|arrow maker gene 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0
> tig00000226|arrow maker mRNA 26339 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0;Name=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1;_AED=0.47;_eAED=0.68;_QI=0|0|0|0.75|1|1|4|0|82
> tig00000226|arrow maker exon 26339 26353 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:58;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1
> tig00000226|arrow maker exon 27490 27636 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:59;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1
> tig00000226|arrow maker exon 27738 27808 . + .
ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:60;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker exon 27900 27915 . + . ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:exon:61;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 26339 26353 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 27490 27636 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 27738 27808 . + 0 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker CDS 27900 27915 . + 1 ID=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1:cds;Parent=augustus_masked-tig00000226|arrow-processed-gene-0.0-mRNA-1 > tig00000226|arrow maker gene 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1 > tig00000226|arrow maker mRNA 5803 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00000226|arrow-augustus-gene-0.1;Name=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1;_AED=0.63;_eAED=0.69;_QI=0|0|0|1|0|0|2|0|85 > tig00000226|arrow maker exon 5803 5975 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:62;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > tig00000226|arrow maker exon 6505 6589 . + . ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:exon:63;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > tig00000226|arrow maker CDS 5803 5975 . + 0 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > > tig00000226|arrow maker CDS 6505 6589 . 
+ 1 ID=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00000226|arrow-augustus-gene-0.1-mRNA-1 > > tig00034405|arrow . contig 1 104941 . . . ID=tig00034405|arrow;Name=tig00034405|arrow > tig00034405|arrow maker gene 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0 > tig00034405|arrow maker mRNA 40927 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.0;Name=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1;_AED=0.04;_eAED=0.04;_QI=266|1|1|1|0|0|3|0|100 > tig00034405|arrow maker exon 40927 41273 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7157;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker exon 41476 41622 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7158;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker exon 50954 51025 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:exon:7159;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker five_prime_UTR 40927 41192 . + . ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:five_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker CDS 41193 41273 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker CDS 41476 41622 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker CDS 50954 51025 . + 0 ID=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.0-mRNA-1 > tig00034405|arrow maker gene 57931 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2 > tig00034405|arrow maker mRNA 57931 61565 . + . 
ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2;Name=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1;_AED=0.13;_eAED=0.13;_QI=0|0.5|0|0.66|1|1|3|0|522 > tig00034405|arrow maker exon 57931 58962 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7160;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker exon 59245 59725 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7161;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker exon 61510 61565 . + . ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:exon:7162;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker CDS 57931 58962 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker CDS 59245 59725 . + 0 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker CDS 61510 61565 . + 2 ID=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1:cds;Parent=augustus_masked-tig00034405|arrow-processed-gene-0.2-mRNA-1 > tig00034405|arrow maker gene 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1 > tig00034405|arrow maker mRNA 90355 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;Parent=maker-tig00034405|arrow-augustus-gene-0.1;Name=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1;_AED=0.28;_eAED=0.28;_QI=0|0.7|0.72|1|1|1|11|386|425 > tig00034405|arrow maker exon 90355 90911 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7173;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 91011 91086 . - . 
ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7172;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 91179 91240 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7171;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 91557 91706 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7170;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 92996 93064 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7169;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93156 93347 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7168;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93453 93637 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7167;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93711 93786 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7166;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 93866 93972 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7165;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 94536 94573 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7164;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker exon 94645 94796 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:exon:7163;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 94645 94796 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 94536 94573 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93866 93972 . 
- 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93711 93786 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93453 93637 . - 2 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 93156 93347 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 92996 93064 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 91557 91706 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 91179 91240 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 91011 91086 . - 1 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > tig00034405|arrow maker CDS 90741 90911 . - 0 ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:cds;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > > tig00034405|arrow maker three_prime_UTR 90355 90740 . - . ID=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1:three_prime_utr;Parent=maker-tig00034405|arrow-augustus-gene-0.1-mRNA-1 > _______________________________________________ > maker-devel mailing list > maker-devel at box290.bluehost.com > http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... 
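Carson's point that the number inside an exon ID= value is only a unique key, and that one exon line may belong to several transcripts, can be illustrated with a short Python sketch. It derives a per-transcript exon rank (1, 2, 3, ...) from coordinates and strand instead of from the ID. This is an illustration under stated assumptions, not a MAKER utility: the function name exon_ranks is hypothetical, and the input is assumed to be tab-delimited GFF3 in which exon lines carry a Parent= attribute.

```python
from collections import defaultdict


def exon_ranks(gff_lines):
    """Map each transcript ID to a list of (rank, start, end) for its exons.

    Rank 1 is the 5'-most exon: the lowest start on the + strand and the
    highest start on the - strand. (Hypothetical helper, not part of MAKER.)
    """
    exons = defaultdict(list)
    for raw in gff_lines:
        line = raw.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != "exon":
            continue
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        # An exon may list several parents, separated by commas.
        for parent in attrs.get("Parent", "").split(","):
            if parent:
                exons[parent].append((start, end, strand))

    ranked = {}
    for parent, items in exons.items():
        minus = items[0][2] == "-"
        ordered = sorted(items, key=lambda e: e[0], reverse=minus)
        ranked[parent] = [(i, s, e) for i, (s, e, _) in enumerate(ordered, 1)]
    return ranked


if __name__ == "__main__":
    demo = [
        "tig\tmaker\texon\t90355\t90911\t.\t-\t.\tID=t1:exon:7173;Parent=t1",
        "tig\tmaker\texon\t94645\t94796\t.\t-\t.\tID=t1:exon:7163;Parent=t1",
    ]
    for transcript, ranks in exon_ranks(demo).items():
        print(transcript, ranks)
```

Because ranks are computed per Parent, an exon shared by two transcripts can receive a different rank in each, which is the situation described in the reply.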
URL:

From paul at tupac.bio Thu Apr 18 03:23:35 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Thu, 18 Apr 2019 18:23:35 +0900
Subject: [maker-devel] maker_functional_gff Error
Message-ID:

Dear MAKER Team,

I am running MAKER 2.31.10 on a 32-core instance. I followed the Post Processing of Annotations steps as described in the MAKER Tutorial for GMOD Online Training 2014 as best I could, but I get an error when I run maker_functional_gff. The commands, in order of execution, and the relevant output are shown below.

What did I do wrong?

# run blastp command
blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 -out output.blastp

# run interproscan command
interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p -i genome.all.maker.proteins.fasta -o output.iprscan

# create naming table
maker_map_ids --prefix ThuMac --justify 5 genome.all.gff > genome.all.map

# copy files for safe keeping
cp genome.all.gff genome.all.renamed.gff
cp genome.all.noseq.gff genome.all.noseq.renamed.gff
cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta
cp genome.all.maker.proteins.aed.0.50.fasta genome.all.maker.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.unique.proteins.aed.0.50.fasta genome.all.maker.unique.proteins.aed.0.50.renamed.fasta
cp genome.all.maker.transcripts.fasta genome.all.maker.transcripts.renamed.fasta
cp genome.all.maker.transcripts.aed.0.50.fasta genome.all.maker.transcripts.aed.0.50.renamed.fasta
cp output.iprscan output.renamed.iprscan
cp output.blastp output.renamed.blastp

# replace uninformative MAKER protein/transcript names with useful ones
map_gff_ids genome.all.map genome.all.renamed.gff
map_gff_ids genome.all.map genome.all.noseq.renamed.gff
map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta
map_fasta_ids genome.all.map genome.all.maker.proteins.aed.0.50.renamed.fasta
map_fasta_ids genome.all.map
genome.all.maker.unique.proteins.aed.0.50.renamed.fasta map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta map_fasta_ids genome.all.map genome.all.maker.transcripts.aed.0.50.renamed.fasta map_data_ids genome.all.map output.renamed.iprscan map_data_ids genome.all.map output.renamed.blastp # assign annotations maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff > head output.renamed.blastp ThuMac30929-RA P20036 41.791 134 77 1 326 458 113 246 9.51e-28 114 ThuMac19623-RA P81018 35.714 168 87 2 1 147 1 168 8.40e-33 117 ThuMac19629-RA Q66I51 68.939 264 79 2 1 263 1 262 1.48e-130 372 ThuMac19628-RA Q61464 55.172 87 37 1 766 852 382 466 4.42e-25 119 ThuMac19627-RA P07898 48.276 58 29 1 13 69 1962 2019 3.60e-13 65.9 ThuMac19626-RA P81018 36.782 174 96 2 21 180 1 174 5.75e-36 127 ThuMac19624-RA P81018 35.057 174 99 2 21 180 1 174 2.19e-33 120 ThuMac19625-RA Q28343 32.520 123 43 2 35 117 2123 2245 7.57e-17 78.6 ThuMac19636-RA Q9QX29 90.909 110 10 0 5 114 458 567 6.45e-65 216 ThuMac19638-RA Q9QX29 57.391 115 35 3 5 114 703 808 3.06e-28 120 > head output.renamed.iprscan ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00520 Ion transport protein 154 413 3.8E-21 T 18-04-2019 IPR005821 Ion transport domain GO:0005216|GO:0006811|GO:0016020|GO:0055085 ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF08412 Ion transport protein N-terminal 109 152 5.1E-19 T 18-04-2019 IPR013621 Ion transport N-terminal Reactome: R-HSA-1296061 ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00027 Cyclic nucleotide-binding domain 519 601 1.0E-17 T 18-04-2019 IPR000595 Cyclic nucleotide-binding domain ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF13765 SPRY-associated domain 235 283 8.9E-23 T 18-04-2019 IPR006574 SPRY-associated ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00643 B-box zinc finger 18 56 5.2E-12 T 18-04-2019 IPR000315 B-box-type zinc 
finger GO:0008270 ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00622 SPRY domain 287 391 2.2E-14 T 18-04-2019 IPR003877 SPRY domain GO:0005515 ThuMac08369-RA 7aee1da5a47975ab8e43b68bfd1a117c 139 Pfam PF00076 RNA recognition motif. (a.k.a. RRM, RBD, or RNP domain) 22 87 1.6E-15 T 18-04-2019 IPR000504 RNA recognition motif domain GO:0003676 ThuMac26054-RA 8f4119609312bd6442f8bb094c104231 462 Pfam PF07565 Band 3 cytoplasmic domain 173 443 7.3E-100 T 18-04-2019 IPR013769 Band 3 cytoplasmic domain GO:0006820|GO:0008509|GO:0016021 Reactome: R-HSA-425381 ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF03372 Endonuclease/Exonuclease/phosphatase family 235 535 7.0E-11 T 18-04-2019 IPR005135 Endonuclease/exonuclease/phosphatase ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF17751 SKICH domain 555 649 9.8E-23 T 18-04-2019 IPR041611 SKICH domain > map_data_ids genome.all.map output.renamed.iprscan WARNING: No mapping available for ThuMac01937-RA WARNING: No mapping available for ThuMac02226-RA WARNING: No mapping available for ThuMac20730-RA WARNING: No mapping available for ThuMac20730-RA WARNING: No mapping available for ThuMac14750-RA (Thousands of warnings like these were returned) > maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3. > head genome.all.renamed.putative_function.gff ##gff-version 3 scf7180000008677_pilon_pilon . contig 1 49996 . . . ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon Thanks in Advance, Paul Sheridan -- CSO at Tupac Bio Email: paul at tupac.bio Homepage: www.paulsheridan.net Mobile: +81 80 7889 0859 -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From carsonhh at gmail.com Mon Apr 22 11:50:27 2019
From: carsonhh at gmail.com (Carson Holt)
Date: Mon, 22 Apr 2019 11:50:27 -0600
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To:
References:
Message-ID: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>

This "WARNING: No mapping available for ThuMac01937-RA" means you are running on a file that has already been renamed. An un-renamed file will have names like maker-SDFGDG-gene-0.1-mRNA-1, for example, but it's finding the name ThuMac01937-RA, which is not in the first column of the map file, so it throws a warning.

The second one:

> Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.

You likely have a truncated line in the GFF3. It's missing an ID= tag. This can sometimes happen when writing to network-mounted (NFS) file systems because of an asynchronous IO error. NFS file systems have a performance enhancement where they return SUCCESS on IO operations right away and then complete the IO operation later in the background. This improves speed by letting the program advance without blocking for the IO operation, but it reduces reliability because if the later operation is not really successful, it can't go back and tell the program "never mind, it failed." The result is a silent truncation of data. Not super common, but not all that rare either, depending on IO load (e.g. heavy MPI with lots of writes). Find the line that's truncated, then rerun just that contig before building the merged GFF3 for everything.

--Carson

> On Apr 18, 2019, at 3:23 AM, Paul Sheridan wrote:
>
> Dear MAKER Team,
>
> I am running MAKER 2.31.10 a 32 core instance. I followed the Post Processing of Annotations steps as described in the MAKER Tutorial for GMOD Online Training 2014 as best I could, but I get an error when I run maker_functional_gff. The commands in the order of execution and relevant output are shown below.
>
> Where did I do wrong?
> > # run blastp command > blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 -out output.blastp > > # run interproscan command > interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p -i genome.all.maker.proteins.fasta -o output.iprscan > > # create naming table > maker_map_ids --prefix ThuMac --justify 5 genome.all.gff > genome.all.map > > # copy files for safe keeping > cp genome.all.gff genome.all.renamed.gff > cp genome.all.noseq.gff genome.all.noseq.renamed.gff > cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta > cp genome.all.maker.proteins.aed.0.50.fasta genome.all.maker.proteins.aed.0.50.renamed.fasta > cp genome.all.maker.unique.proteins.aed.0.50.fasta genome.all.maker.unique.proteins.aed.0.50.renamed.fasta > cp genome.all.maker.transcripts.fasta genome.all.maker.transcripts.renamed.fasta > cp genome.all.maker.transcripts.aed.0.50.fasta genome.all.maker.transcripts.aed.0.50.renamed.fasta > cp output.iprscan output.renamed.iprscan > cp output.blastp output.renamed.blastp > > # replace uninformative MAKER protein/transcript names with useful ones > map_gff_ids genome.all.map genome.all.renamed.gff > map_gff_ids genome.all.map genome.all.noseq.renamed.gff > map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta > map_fasta_ids genome.all.map genome.all.maker.proteins.aed.0.50.renamed.fasta > map_fasta_ids genome.all.map genome.all.maker.unique.proteins.aed.0.50.renamed.fasta > map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta > map_fasta_ids genome.all.map genome.all.maker.transcripts.aed.0.50.renamed.fasta > map_data_ids genome.all.map output.renamed.iprscan > map_data_ids genome.all.map output.renamed.blastp > > # assign annotations > maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff > > > head output.renamed.blastp > 
ThuMac30929-RA P20036 41.791 134 77 1 326 458 113 246 9.51e-28 114 > ThuMac19623-RA P81018 35.714 168 87 2 1 147 1 168 8.40e-33 117 > ThuMac19629-RA Q66I51 68.939 264 79 2 1 263 1 262 1.48e-130 372 > ThuMac19628-RA Q61464 55.172 87 37 1 766 852 382 466 4.42e-25 119 > ThuMac19627-RA P07898 48.276 58 29 1 13 69 1962 2019 3.60e-13 65.9 > ThuMac19626-RA P81018 36.782 174 96 2 21 180 1 174 5.75e-36 127 > ThuMac19624-RA P81018 35.057 174 99 2 21 180 1 174 2.19e-33 120 > ThuMac19625-RA Q28343 32.520 123 43 2 35 117 2123 2245 7.57e-17 78.6 > ThuMac19636-RA Q9QX29 90.909 110 10 0 5 114 458 567 6.45e-65 216 > ThuMac19638-RA Q9QX29 57.391 115 35 3 5 114 703 808 3.06e-28 120 > > > head output.renamed.iprscan > ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00520 Ion transport protein 154 413 3.8E-21 T 18-04-2019 IPR005821 Ion transport domain GO:0005216|GO:0006811|GO:0016020|GO:0055085 > ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF08412 Ion transport protein N-terminal 109 152 5.1E-19 T 18-04-2019 IPR013621 Ion transport N-terminal Reactome: R-HSA-1296061 > ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00027 Cyclic nucleotide-binding domain 519 601 1.0E-17 T 18-04-2019 IPR000595 Cyclic nucleotide-binding domain > ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF13765 SPRY-associated domain 235 283 8.9E-23 T 18-04-2019 IPR006574 SPRY-associated > ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00643 B-box zinc finger 18 56 5.2E-12 T 18-04-2019 IPR000315 B-box-type zinc finger GO:0008270 > ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00622 SPRY domain 287 391 2.2E-14 T 18-04-2019 IPR003877 SPRY domain GO:0005515 > ThuMac08369-RA 7aee1da5a47975ab8e43b68bfd1a117c 139 Pfam PF00076 RNA recognition motif. (a.k.a. 
RRM, RBD, or RNP domain) 22 87 1.6E-15 T 18-04-2019 IPR000504 RNA recognition motif domain GO:0003676 > ThuMac26054-RA 8f4119609312bd6442f8bb094c104231 462 Pfam PF07565 Band 3 cytoplasmic domain 173 443 7.3E-100 T 18-04-2019 IPR013769 Band 3 cytoplasmic domain GO:0006820|GO:0008509|GO:0016021 Reactome: R-HSA-425381 > ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF03372 Endonuclease/Exonuclease/phosphatase family 235 535 7.0E-11 T 18-04-2019 IPR005135 Endonuclease/exonuclease/phosphatase > ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF17751 SKICH domain 555 649 9.8E-23 T 18-04-2019 IPR041611 SKICH domain > > > map_data_ids genome.all.map output.renamed.iprscan > WARNING: No mapping available for ThuMac01937-RA > WARNING: No mapping available for ThuMac02226-RA > WARNING: No mapping available for ThuMac20730-RA > WARNING: No mapping available for ThuMac20730-RA > WARNING: No mapping available for ThuMac14750-RA > (Thousands of warnings like these were returned) > > > maker_functional_gff uniprot_sprot.db output.renamed.blastp genome.all.renamed.gff > genome.all.renamed.putative_function.gff > Can't use string ("") as a HASH ref while "strict refs" in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3. > > > head genome.all.renamed.putative_function.gff > ##gff-version 3 > scf7180000008677_pilon_pilon . contig 1 49996 . . . ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon > > Thanks in Advance, > > Paul Sheridan > > -- > CSO at Tupac Bio > Email: paul at tupac.bio > Homepage: www.paulsheridan.net > Mobile: +81 80 7889 0859 > _______________________________________________ > maker-devel mailing list > maker-devel at yandell-lab.org > http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org -------------- next part -------------- An HTML attachment was scrubbed... 
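The advice to find the truncated line and rerun just that contig can be approached with a minimal Python sketch that scans a merged GFF3 for feature lines that look incomplete. It is a sketch under assumptions rather than a MAKER tool: the function name find_suspect_lines is hypothetical, the file is assumed to be tab-delimited, and the check that every feature line carries an ID= attribute follows the reply above (MAKER output), not a general GFF3 requirement.

```python
def find_suspect_lines(gff_lines):
    """Return (line_number, seqid, reason) for feature lines that look truncated.

    A line is flagged if it does not have the nine GFF3 columns, or if its
    attribute column has no ID= field. (Hypothetical helper, not part of MAKER.)
    """
    suspects = []
    for number, raw in enumerate(gff_lines, 1):
        line = raw.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        seqid = cols[0]
        if len(cols) != 9:
            suspects.append((number, seqid, "has %d columns, expected 9" % len(cols)))
        elif not any(field.startswith("ID=") for field in cols[8].split(";")):
            suspects.append((number, seqid, "no ID= attribute"))
    return suspects


if __name__ == "__main__":
    import sys

    # Usage (file name taken from the thread): python check_gff.py genome.all.gff
    with open(sys.argv[1]) as handle:
        for number, seqid, reason in find_suspect_lines(handle):
            print("line %d (contig %s): %s" % (number, seqid, reason))
```

The seqid reported for each flagged line names the contig whose output would need to be regenerated before the merged GFF3 is rebuilt.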
URL:

From paul at tupac.bio Sun Apr 28 19:40:12 2019
From: paul at tupac.bio (Paul Sheridan)
Date: Mon, 29 Apr 2019 10:40:12 +0900
Subject: [maker-devel] maker_functional_gff Error
In-Reply-To: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
References: <4AE21A4F-77F5-4DD8-8614-0D037F7C5209@gmail.com>
Message-ID:

Hi Carson,

Thanks, your suggestions got me sorted out.

Best,
Paul

On Tue, Apr 23, 2019 at 2:50 AM Carson Holt wrote:

> This "WARNING: No mapping available for ThuMac01937-RA" means you are
> running on a file that has already been renamed. An un-renamed file will
> have names like maker-SDFGDG-gene-0.1-mRNA-1, for example, but it's finding
> the name ThuMac01937-RA, which is not in the first column of the map file,
> so it throws a warning.
>
> The second one:
>
> > Can't use string ("") as a HASH ref while "strict refs"
> in use at /root/maker/bin/maker_functional_gff line 55, <$IN> line 3.
>
>
> You likely have a truncated line in the GFF3. It's missing an ID= tag. This
> can sometimes happen when writing to network-mounted (NFS) file systems
> because of an asynchronous IO error. NFS file systems have a performance
> enhancement where they return SUCCESS on IO operations right away and then
> complete the IO operation later in the background. This improves speed by
> letting the program advance without blocking for the IO operation, but it
> reduces reliability because if the later operation is not really
> successful, it can't go back and tell the program "never mind, it failed."
> The result is a silent truncation of data. Not super common, but not all
> that rare either, depending on IO load (e.g. heavy MPI with lots of writes).
> Find the line that's truncated, then rerun just that contig before building
> the merged GFF3 for everything.
>
> --Carson
>
>
>
> On Apr 18, 2019, at 3:23 AM, Paul Sheridan wrote:
>
> Dear MAKER Team,
>
> I am running MAKER 2.31.10 a 32 core instance.
I followed the Post > Processing of Annotations steps as described in the MAKER Tutorial for GMOD > Online Training 2014 as best I could, but I get an error when I run > maker_functional_gff. The commands in the order of execution and relevant > output are shown below. > > Where did I do wrong? > > # run blastp command > blastp -query genome.all.maker.proteins.fasta -db uniprot_sprot.fasta > -num_threads 32 -evalue 1e-6 -max_hsps 1 -max_target_seqs 1 -outfmt 6 -out > output.blastp > > # run interproscan command > interproscan.sh -appl pfam -dp -f TSV -goterms -iprlookup -pa -t p -i > genome.all.maker.proteins.fasta -o output.iprscan > > # create naming table > maker_map_ids --prefix ThuMac --justify 5 genome.all.gff > genome.all.map > > # copy files for safe keeping > cp genome.all.gff genome.all.renamed.gff > cp genome.all.noseq.gff genome.all.noseq.renamed.gff > cp genome.all.maker.proteins.fasta genome.all.maker.proteins.renamed.fasta > cp genome.all.maker.proteins.aed.0.50.fasta > genome.all.maker.proteins.aed.0.50.renamed.fasta > cp genome.all.maker.unique.proteins.aed.0.50.fasta > genome.all.maker.unique.proteins.aed.0.50.renamed.fasta > cp genome.all.maker.transcripts.fasta > genome.all.maker.transcripts.renamed.fasta > cp genome.all.maker.transcripts.aed.0.50.fasta > genome.all.maker.transcripts.aed.0.50.renamed.fasta > cp output.iprscan output.renamed.iprscan > cp output.blastp output.renamed.blastp > > # replace uninformative MAKER protein/transcript names with useful ones > map_gff_ids genome.all.map genome.all.renamed.gff > map_gff_ids genome.all.map genome.all.noseq.renamed.gff > map_fasta_ids genome.all.map genome.all.maker.proteins.renamed.fasta > map_fasta_ids genome.all.map > genome.all.maker.proteins.aed.0.50.renamed.fasta > map_fasta_ids genome.all.map > genome.all.maker.unique.proteins.aed.0.50.renamed.fasta > map_fasta_ids genome.all.map genome.all.maker.transcripts.renamed.fasta > map_fasta_ids genome.all.map > 
genome.all.maker.transcripts.aed.0.50.renamed.fasta > map_data_ids genome.all.map output.renamed.iprscan > map_data_ids genome.all.map output.renamed.blastp > > # assign annotations > maker_functional_gff uniprot_sprot.db output.renamed.blastp > genome.all.renamed.gff > genome.all.renamed.putative_function.gff > > > head output.renamed.blastp > ThuMac30929-RA P20036 41.791 134 77 1 326 458 113 246 9.51e-28 114 > ThuMac19623-RA P81018 35.714 168 87 2 1 147 1 168 8.40e-33 117 > ThuMac19629-RA Q66I51 68.939 264 79 2 1 263 1 262 1.48e-130 372 > ThuMac19628-RA Q61464 55.172 87 37 1 766 852 382 466 4.42e-25 119 > ThuMac19627-RA P07898 48.276 58 29 1 13 69 1962 2019 3.60e-13 65.9 > ThuMac19626-RA P81018 36.782 174 96 2 21 180 1 174 5.75e-36 127 > ThuMac19624-RA P81018 35.057 174 99 2 21 180 1 174 2.19e-33 120 > ThuMac19625-RA Q28343 32.520 123 43 2 35 117 2123 2245 7.57e-17 78.6 > ThuMac19636-RA Q9QX29 90.909 110 10 0 5 114 458 567 6.45e-65 216 > ThuMac19638-RA Q9QX29 57.391 115 35 3 5 114 703 808 3.06e-28 120 > > > head output.renamed.iprscan > ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00520 Ion > transport protein 154 413 3.8E-21 T 18-04-2019 IPR005821 Ion transport > domain GO:0005216|GO:0006811|GO:0016020|GO:0055085 > ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF08412 Ion > transport protein N-terminal 109 152 5.1E-19 T 18-04-2019 IPR013621 Ion > transport N-terminal Reactome: R-HSA-1296061 > ThuMac08407-RA f1e60af0e3add9ce493bd7a78114da1e 631 Pfam PF00027 Cyclic > nucleotide-binding domain 519 601 1.0E-17 T 18-04-2019 IPR000595 Cyclic > nucleotide-binding domain > ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF13765 SPRY-associated > domain 235 283 8.9E-23 T 18-04-2019 IPR006574 SPRY-associated > ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00643 B-box > zinc finger 18 56 5.2E-12 T 18-04-2019 IPR000315 B-box-type zinc finger > GO:0008270 > ThuMac24094-RA f3c3ae9be61177558ac12f745bd0dd8e 414 Pfam PF00622 SPRY > 
domain 287 391 2.2E-14 T 18-04-2019 IPR003877 SPRY domain GO:0005515 > ThuMac08369-RA 7aee1da5a47975ab8e43b68bfd1a117c 139 Pfam PF00076 RNA > recognition motif. (a.k.a. RRM, RBD, or RNP domain) 22 87 1.6E-15 T > 18-04-2019 IPR000504 RNA recognition motif domain GO:0003676 > ThuMac26054-RA 8f4119609312bd6442f8bb094c104231 462 Pfam PF07565 Band 3 > cytoplasmic domain 173 443 7.3E-100 T 18-04-2019 IPR013769 Band 3 > cytoplasmic domain GO:0006820|GO:0008509|GO:0016021 Reactome: R-HSA-425381 > ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF03372 Endonuclease/Exonuclease/phosphatase > family 235 535 7.0E-11 T 18-04-2019 IPR005135 > Endonuclease/exonuclease/phosphatase > ThuMac07958-RA d2b749fa573a5e452cadee56090c9588 804 Pfam PF17751 SKICH > domain 555 649 9.8E-23 T 18-04-2019 IPR041611 SKICH domain > > > map_data_ids genome.all.map output.renamed.iprscan > WARNING: No mapping available for ThuMac01937-RA > WARNING: No mapping available for ThuMac02226-RA > WARNING: No mapping available for ThuMac20730-RA > WARNING: No mapping available for ThuMac20730-RA > WARNING: No mapping available for ThuMac14750-RA > (Thousands of warnings like these were returned) > > > maker_functional_gff uniprot_sprot.db output.renamed.blastp > genome.all.renamed.gff > genome.all.renamed.putative_function.gff > Can't use string ("") as a HASH ref while "strict refs" in use at > /root/maker/bin/maker_functional_gff line 55, <$IN> line 3. > > > head genome.all.renamed.putative_function.gff > ##gff-version 3 > scf7180000008677_pilon_pilon . contig 1 49996 . . . 
> ID=scf7180000008677_pilon_pilon;Name=scf7180000008677_pilon_pilon
>
> Thanks in Advance,
>
> Paul Sheridan
>
> --
> CSO at Tupac Bio
> Email: paul at tupac.bio
> Homepage: www.paulsheridan.net
> Mobile: +81 80 7889 0859
> _______________________________________________
> maker-devel mailing list
> maker-devel at yandell-lab.org
> http://yandell-lab.org/mailman/listinfo/maker-devel_yandell-lab.org
>
>
>

--
CSO at Tupac Bio
Email: paul at tupac.bio
Homepage: www.paulsheridan.net
Mobile: +81 80 7889 0859
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bastian.schiffthaler at umu.se Wed Apr 24 01:43:15 2019
From: bastian.schiffthaler at umu.se (Bastian Schiffthaler)
Date: Wed, 24 Apr 2019 07:43:15 -0000
Subject: [maker-devel] Redundant FASTA headers
Message-ID: <251d38f5-c15a-6070-fcd9-d6144744885e@umu.se>

Hi,

I'm running the MPI version of MAKER and I'm supplying seven different Trinity assemblies (from different experiments) as evidence. Trinity does not generate unique FASTA headers >across< files, so I'm wondering whether there could be an issue with ID collisions. What does MAKER use the headers for? Could duplicate headers create race conditions in temp files?

Thanks in advance,

Bastian

From Christian_jpg2 at hotmail.com Tue Apr 30 12:42:31 2019
From: Christian_jpg2 at hotmail.com (Christian Ayala)
Date: Tue, 30 Apr 2019 18:42:31 +0000
Subject: [maker-devel] Running out of time in MAKER
Message-ID:

Good afternoon,

I am trying to annotate some insect genomes using MAKER. MAKER is running on a system that uses a PBS scheduler and has a walltime limit of 120 hours. As a result, my jobs run out of time and are killed before MAKER finishes the annotation. Is there a way to resume a killed MAKER run?

Thanks for your help.

Best regards,

Christian Ayala-Ortiz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: