<div dir="ltr">That's great! Thanks for the tips, Carson. <div><br></div><div>Urmi</div><div class="gmail_extra"><br><div class="gmail_quote">On Fri, Mar 23, 2018 at 5:28 PM, Carson Holt <span dir="ltr"><<a href="mailto:carsonhh@gmail.com" target="_blank">carsonhh@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word;line-break:after-white-space">Run A -> no gene prediction, just cut and paste of transcript/protein alignments to generate rough models.<div>Run B -> Gene predictions based on training using only a highly conserved subset of genes (you will have low sensitivity)</div><div><div>Run C -> Gene predictions based on training using a broader gene set. Higher sensitivity but potentially lower specificity (sensitivity gains should outweigh any specificity loss).</div><div><br></div><div>Finally, make sure you look at models in a browser to see how well evidence and models overlap. 
If gene fusion is an issue (falsely merged mRNA-seq assembly results will generate hints that can cause gene predictors to fuse gene models), try deFusion —> <a href="https://wjidea.github.io/defusion/installation.html" target="_blank">https://wjidea.github.io/<wbr>defusion/installation.html</a></div><div><br></div><div>—Carson</div><div><br></div><div><br></div><div><br><blockquote type="cite"><div><div class="h5"><div>On Mar 21, 2018, at 3:05 AM, Urmi <<a href="mailto:urmi208@gmail.com" target="_blank">urmi208@gmail.com</a>> wrote:</div><br class="m_3476330291946200717Apple-interchange-newline"></div></div><div><div><div class="h5"><div dir="ltr"><p style="box-sizing:border-box;margin:0px 0px 10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">Hello maker community,</p><p style="box-sizing:border-box;margin:0px 0px 10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">I am trying to run maker 3.01.02-beta on a fungal genome. I am using available EST and protein sequences from a different strain of the same species using parameters "est" and "protein" in the maker_opts.ctl file. 
Here is the protocol I am using:</p><ol style="box-sizing:border-box;margin-top:0px;margin-bottom:10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><li style="box-sizing:border-box">Run maker with repeat masking, providing transcript and protein sequences from related species (run A)</li><li style="box-sizing:border-box">Create a SNAP model with CEGMA</li><li style="box-sizing:border-box">Train Augustus with BUSCO</li><li style="box-sizing:border-box">Run maker (run B) with the new SNAP model (from step 2) and the new Augustus species, with est2genome=0 and protein2genome=0, and provide the run A alignments as gff files (altest_gff=runA_cdna2genome.<wbr>gff, protein_gff=runA_<wbr>protein2genome.gff3)</li><li style="box-sizing:border-box">Create a SNAP model from run B</li><li style="box-sizing:border-box">Train Augustus with transcripts from run B and BUSCO</li><li style="box-sizing:border-box">Run maker (run C) with the new SNAP model (from step 5) and the new Augustus species, with est2genome=0 and protein2genome=0, provide the run A alignments as gff files (altest_gff=runA_cdna2genome.<wbr>gff, protein_gff=runA_<wbr>protein2genome.gff3), and set keep_preds=1</li></ol><p style="box-sizing:border-box;margin:0px 0px 10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">As a result of this, I get the following gene counts:</p><ul 
style="box-sizing:border-box;margin-top:0px;margin-bottom:10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><li style="box-sizing:border-box">run A: 12796 total genes out of which 12771 have AED < 0.5</li><li style="box-sizing:border-box">run B: 10713 total genes out of which 10701 have AED < 0.5</li><li style="box-sizing:border-box">run C: 12651 total genes out of which 12582 have AED < 0.5</li></ul><span style="color:rgb(51,51,51);font-family:sans-serif;font-size:13px">Looking at the gff files in detail, I observed that some gene models in run A are lost in run B and regained in run C. <span style="color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial;float:none;display:inline">I don't understand why genes are lost in run B.<span> Here is an example:</span></span></span><div><font color="#333333" face="sans-serif"><br></font></div><div><font color="#333333" face="sans-serif"><b>RunA:</b><br></font><div><font color="#333333" face="sans-serif"><br></font></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><font color="#333333" face="sans-serif"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker gene 20468 21193 . + . 
ID=maker-contig1-exonerate_<wbr>protein2genome-gene-0.34;Name=<wbr>maker-contig1-exonerate_<wbr>protein2genome-gene-0.34</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker mRNA 20468 21193 100 + . ID=maker-contig1-exonerate_<wbr>protein2genome-gene-0.34-mRNA-<wbr>1;Parent=maker-contig1-<wbr>exonerate_protein2genome-gene-<wbr>0.34;Name=maker-contig1-<wbr>exonerate_protein2genome-gene-<wbr>0.34-mRNA-1;_AED=0.30;_eAED=0.<wbr>30;_QI=0|-1|0|1|-1|0|1|0|241</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker exon 20468 21193 . + . ID=maker-contig1-exonerate_<wbr>protein2genome-gene-0.34-mRNA-<wbr>1:1;Parent=maker-contig1-<wbr>exonerate_protein2genome-gene-<wbr>0.34-mRNA-1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker CDS 20468 21193 . + 0 ID=maker-contig1-exonerate_<wbr>protein2genome-gene-0.34-mRNA-<wbr>1:cds;Parent=maker-contig1-<wbr>exonerate_protein2genome-gene-<wbr>0.34-mRNA-1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 blastn expressed_sequence_match 20468 21193 726 + . ID=contig1:hit:983:3.2.0.0;<wbr>Name=jgi|test_1|140804|est target_length=726</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 blastn match_part 20468 21193 726 + . ID=contig1:hsp:998:3.2.0.0;<wbr>Parent=contig1:hit:983:3.2.0.<wbr>0;Target=jgi|test_1|140804|est 1 726 +;Gap=M726</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 est2genome expressed_sequence_match 20468 21193 3630 + . 
ID=contig1:hit:1022:3.2.0.0;<wbr>Name=jgi|test_1|140804|est;<wbr>target_length=726;aligned_<wbr>coverage=100;aligned_identity=<wbr>100</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 est2genome match_part 20468 21193 3630 + . ID=contig1:hsp:1110:3.2.0.0;<wbr>Parent=contig1:hit:1022:3.2.0.<wbr>0;Target=jgi|test_1|140804|est 1 726 +;Gap=M726</blockquote></font></blockquote><div><div><font color="#333333" face="sans-serif"><br></font></div><div><font color="#333333" face="sans-serif"><b>RunB:</b></font></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><font color="#333333" face="sans-serif"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 est_gff:est2genome expressed_sequence_match 20468 21193 3630 + . ID=contig1:hit:1051:3.12.0.0;<wbr>Name=jgi|test_1|140804|est;<wbr>target_length=726;aligned_<wbr>coverage=100;aligned_identity=<wbr>100;aligned_coverage=100;<wbr>aligned_identity=100;score=<wbr>3630;target_length=726</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 est_gff:est2genome match_part 20468 21193 3630 + . ID=contig1:hsp:1166:3.12.0.0;<wbr>Parent=contig1:hit:1051:3.12.<wbr>0.0;Target=jgi|test_1|140804|<wbr>est 1 726 +;Gap=M726</blockquote></font></blockquote><div><br></div><div><b>RunC: </b></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker gene 20468 21193 . + . 
ID=snap_masked-contig1-<wbr>processed-gene-0.5;Name=snap_<wbr>masked-contig1-processed-gene-<wbr>0.5</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker mRNA 20468 21193 . + . ID=snap_masked-contig1-<wbr>processed-gene-0.5-mRNA-1;<wbr>Parent=snap_masked-contig1-<wbr>processed-gene-0.5;Name=snap_<wbr>masked-contig1-processed-gene-<wbr>0.5-mRNA-1;_AED=0.30;_eAED=0.<wbr>30;_QI=0|-1|0|1|-1|1|1|0|241;_<wbr>merge_warning=1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker exon 20468 21193 . + . ID=snap_masked-contig1-<wbr>processed-gene-0.5-mRNA-1:1;<wbr>Parent=snap_masked-contig1-<wbr>processed-gene-0.5-mRNA-1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 maker CDS 20468 21193 . + 0 ID=snap_masked-contig1-<wbr>processed-gene-0.5-mRNA-1:cds;<wbr>Parent=snap_masked-contig1-<wbr>processed-gene-0.5-mRNA-1</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 snap_masked match 20468 21193 42.956 + . ID=contig1:hit:5240:4.5.0.0;<wbr>Name=snap_masked-contig1-<wbr>abinit-gene-0.5-mRNA-1;target_<wbr>length=4075195</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 snap_masked match_part 20468 21193 42.956 + . ID=contig1:hsp:12911:4.5.0.0;<wbr>Parent=contig1:hit:5240:4.5.0.<wbr>0;Target=snap_masked-contig1-<wbr>abinit-gene-0.5-mRNA-1 1 726 +;Gap=M726</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 est_gff:est2genome expressed_sequence_match 20468 21193 3630 + . 
ID=contig1:hit:1051:3.12.0.0;<wbr>Name=jgi|test_1|140804|est;<wbr>target_length=726;aligned_<wbr>coverage=100;aligned_identity=<wbr>100;aligned_coverage=100;<wbr>aligned_identity=100;score=<wbr>3630;target_length=726</blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">contig1 est_gff:est2genome match_part 20468 21193 3630 + . ID=contig1:hsp:1166:3.12.0.0;<wbr>Parent=contig1:hit:1051:3.12.<wbr>0.0;Target=jgi|test_1|140804|<wbr>est 1 726 +;Gap=M726</blockquote></blockquote><div><font color="#333333" face="sans-serif"><br></font></div><div><span style="color:rgb(51,51,51);font-family:sans-serif;font-size:13px">Could anyone please shed some light on this?</span><br></div><p style="box-sizing:border-box;margin:0px 0px 10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial"><br></p><p style="box-sizing:border-box;margin:0px 0px 10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">Many thanks in advance.</p><p style="box-sizing:border-box;margin:0px 0px 
10px;color:rgb(51,51,51);font-family:sans-serif;font-size:13px;font-style:normal;font-variant-ligatures:normal;font-variant-caps:normal;font-weight:400;letter-spacing:normal;text-align:start;text-indent:0px;text-transform:none;white-space:normal;word-spacing:0px;background-color:rgb(255,255,255);text-decoration-style:initial;text-decoration-color:initial">Urmi</p></div></div></div></div></div></div>
______________________________<wbr>_________________<br>maker-devel mailing list<br><a href="mailto:maker-devel@box290.bluehost.com" target="_blank">maker-devel@box290.bluehost.<wbr>com</a><br><a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" target="_blank">http://box290.bluehost.com/<wbr>mailman/listinfo/maker-devel_<wbr>yandell-lab.org</a><br></div></blockquote></div></div></div></blockquote></div><br>
</div></div>
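<div>The per-run gene counts quoted in this thread (total genes, and genes with AED below 0.5) can be tallied directly from a MAKER gff file. What follows is only a minimal sketch, not part of MAKER: it assumes tab-delimited GFF3 in which gene and mRNA features carry the source "maker" and each mRNA has an _AED attribute, as in the records quoted in the thread. The function names parse_attributes and count_genes_by_aed are hypothetical, and the sample records are the Run A lines from the thread with tabs restored.</div>

```python
from typing import Dict, Iterable, Tuple


def parse_attributes(field: str) -> Dict[str, str]:
    """Split a GFF3 attribute column (key=value;key=value) into a dict."""
    attrs = {}
    for item in field.strip().split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def count_genes_by_aed(lines: Iterable[str], max_aed: float = 0.5) -> Tuple[int, int]:
    """Return (total genes, genes with at least one mRNA whose _AED < max_aed).

    Only features whose source column is "maker" are counted; evidence
    alignments and ab initio predictions are skipped.
    """
    genes = set()
    passing = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[1] != "maker":
            continue  # also skips any embedded FASTA section
        attrs = parse_attributes(cols[8])
        if cols[2] == "gene":
            genes.add(attrs.get("ID"))
        elif cols[2] == "mRNA" and "_AED" in attrs:
            if float(attrs["_AED"]) < max_aed:
                passing.add(attrs.get("Parent"))
    return len(genes), len(genes & passing)


# Run A records from the thread (gene and mRNA lines only).
GENE_ID = "maker-contig1-exonerate_protein2genome-gene-0.34"
sample = [
    "contig1\tmaker\tgene\t20468\t21193\t.\t+\t.\t"
    f"ID={GENE_ID};Name={GENE_ID}",
    "contig1\tmaker\tmRNA\t20468\t21193\t100\t+\t.\t"
    f"ID={GENE_ID}-mRNA-1;Parent={GENE_ID};Name={GENE_ID}-mRNA-1;"
    "_AED=0.30;_eAED=0.30;_QI=0|-1|0|1|-1|0|1|0|241",
]

total, confident = count_genes_by_aed(sample)
print(total, confident)  # prints "1 1"
```

<div>To use it on a real run, pass an open file handle for the merged gff file instead of the sample list. Collecting the gene coordinates from each run in the same way would also make it possible to list the loci that are present in run A but absent from run B.</div>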