<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252">
<style type="text/css" style="display:none;"> P {margin-top:0;margin-bottom:0;} </style>
</head>
<body dir="ltr">
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Thank you, Xabi and Carson. </div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
With your help, I was able to improve the annotation with a more appropriate number of predictions. </div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
<br>
</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Best,</div>
<div style="font-family: Calibri, Helvetica, sans-serif; font-size: 12pt; color: rgb(0, 0, 0);">
Morgan </div>
<div>
<div id="appendonsend"></div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr tabindex="-1" style="display:inline-block; width:98%">
<div id="divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Xabier Vázquez-Campos &lt;xvazquezc@gmail.com&gt;<br>
<b>Sent:</b> Wednesday, February 6, 2019 11:33 PM<br>
<b>To:</b> morgan sobol; Maker Mailing List<br>
<b>Subject:</b> Re: [maker-devel] Re-annotation, fewer gene predictions</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div dir="ltr">
<div>
<div>SNAP is easy to train, works well on fungal genomes, and its training is explained in the MAKER wiki:<br>
</div>
<div><a href="http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors">http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/MAKER_Tutorial_for_WGS_Assembly_and_Annotation_Winter_School_2018#Training_ab_initio_Gene_Predictors</a></div>
<div><br>
</div>
</div>
<div>Oh, sorry, I didn't explain myself well. What I was trying to say is that before BUSCO, when we only had CEGMA, we trained Augustus differently, because CEGMA wouldn't produce Augustus gene models automatically. I didn't mean that you should use
CEGMA.</div>
<div><br>
</div>
<div></div>
<div>This is what I have in my own documentation about how to train Augustus "the old way":</div>
<div></div>
<div>
<div id="x_gmail-augustus_old" class="x_gmail-section x_gmail-level5">
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<h5>AUGUSTUS… the old way</h5>
<p>Alternatively, you can train AUGUSTUS in a more “manual” way, like when we were using CEGMA. The training starts with the output from the second instance of
<code>fathom</code> in the <a>SNAP training section</a>.</p>
<pre class="x_gmail-bash"><code class="x_gmail-hljs"><span class="x_gmail-hljs-built_in">cd</span> <span class="x_gmail-hljs-variable">${MYGENOME_DIR}</span>/maker/snap1
perl ~/bin/zff2augustus_gbk.pl > <span class="x_gmail-hljs-variable">${MYGENOME}</span>.train1.gb</code></pre>
<blockquote>
<p><code>zff2augustus_gbk.pl</code> generates a GenBank file from
<code>export.dna</code>.</p>
</blockquote>
<p>The actual training of AUGUSTUS will be through the <span style="color:rgb(0,0,255)">
<u><a>webAUGUSTUS server</a></u></span>.</p>
<p>Before proceeding, it is recommended to rename the FASTA headers, especially if they are very long or contain special characters. This is the main cause of failure for jobs submitted to webAUGUSTUS. You can use the
<a href="http://bioinf.uni-greifswald.de/bioinf/downloads/simplifyFastaHeaders.pl">
<code>simplifyFastaHeaders.pl</code></a> script for that:</p>
<pre class="x_gmail-bash"><code class="x_gmail-hljs">perl ~/bin/simplifyFastaHeaders.pl <span class="x_gmail-hljs-variable">${MYGENOME}</span>_assembly.fasta nameStem <span class="x_gmail-hljs-variable">${MYGENOME}</span>_contigs_rename.fasta <span class="x_gmail-hljs-variable">${MYGENOME}</span>_contigs.map
perl ~/bin/simplifyFastaHeaders.pl <span class="x_gmail-hljs-variable">${MYGENOME}</span>_transcripts_assembled.fasta nameStem <span class="x_gmail-hljs-variable">${MYGENOME}</span>_rna_rename.fasta <span class="x_gmail-hljs-variable">${MYGENOME}</span>_rna.map</code></pre>
<blockquote>
<p><code>nameStem</code> is the base name used to name each sequence in the multi-FASTA files. Choose something meaningful, such as
<em>contig</em> and <em>rna</em> for the assembly and RNA-seq files, respectively, or a variant of those. For example, ‘pgcontig’ and ‘pgrna’ for contigs and RNA from
<em>Puccinia graminis</em>.<br>
<strong>DO NOT</strong> give the same <code>nameStem</code> to both FASTA files, and don’t use any special characters.</p>
</blockquote>
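<p>Before renaming, it can help to see which headers are likely to cause trouble. The following is a minimal sketch, not part of the original notes: <code>check_fasta_headers</code> is a hypothetical helper, and both the 50-character limit and the set of "safe" characters are assumptions rather than documented webAUGUSTUS rules.</p>

```shell
# Hypothetical helper: print FASTA headers that are long or that contain
# characters other than letters, digits, underscore, dot and hyphen.
# The default limit of 50 characters is an arbitrary assumption.
check_fasta_headers() {
    fasta="$1"
    max_len="${2:-50}"
    awk -v max="$max_len" '
        /^>/ {
            header = substr($0, 2)   # drop the leading ">"
            if (length(header) > max || header ~ /[^A-Za-z0-9_.-]/) {
                print header
            }
        }
    ' "$fasta"
}
```

<p>For example, <code>check_fasta_headers ${MYGENOME}_assembly.fasta</code> prints nothing if every header already looks safe under these assumptions.</p>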
<p>We need the following files (minimum):</p>
<ul>
<li><code>${MYGENOME}_assembly.fasta</code> as <em>Genome file</em></li><li><code>${MYGENOME}.train1.gb</code> as <em>Training gene structure file</em></li></ul>
<p>If we also have RNA-seq data:</p>
<ul>
<li><code>${MYGENOME}_transcripts_assembled.fasta</code> as <em>cDNA file</em></li></ul>
<p>Use <code>${MYGENOME}_v1</code> as <em>Species name</em>. The retraining step will need a different species name. Otherwise, when Maker2 is rerun, it will see the same name and will not rerun AUGUSTUS, even though the species profile is
different. So, <code>${MYGENOME}_v1</code> does the job and also tracks the version.</p>
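<p>When moving to the next training round, the species name in the MAKER control file has to change as well. A possible sketch, assuming the control file is the usual <code>maker_opts.ctl</code> and uses the <code>augustus_species=</code> line shown in the options dump further down this thread; <code>set_augustus_species</code> is a hypothetical helper name.</p>

```shell
# Hypothetical helper: point augustus_species= at a new species name while
# leaving the trailing "#comment" and all other options untouched.
set_augustus_species() {
    ctl="$1"        # path to the MAKER options file (assumed: maker_opts.ctl)
    species="$2"    # new species name, e.g. mygenome_v2
    tmp=$(mktemp) || return 1
    sed "s|^augustus_species=[^ #]*|augustus_species=${species}|" "$ctl" > "$tmp" || return 1
    # Write back in place so the original file permissions are kept
    cat "$tmp" > "$ctl" && rm -f "$tmp"
}
```

<p>For example, <code>set_augustus_species maker_opts.ctl ${MYGENOME}_v2</code> before the second MAKER run.</p>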
<p>Once the job is finished, the <em>Species parameter archive</em> (<code>parameters.tar.gz</code>) will contain a folder with the model files for your species. Copy it to the species folder of your AUGUSTUS installation.</p>
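<p>The copy step could look roughly like the sketch below. It assumes the archive unpacks to a single top-level folder named after the species, and that the species folder lives under <code>$AUGUSTUS_CONFIG_PATH/species</code>; <code>install_augustus_species</code> is a hypothetical helper, so check the archive contents before relying on it.</p>

```shell
# Hypothetical helper: unpack a webAUGUSTUS parameter archive and copy the
# species folder(s) it contains into <config_dir>/species/.
install_augustus_species() {
    archive="$1"       # e.g. parameters.tar.gz
    config_dir="$2"    # e.g. "$AUGUSTUS_CONFIG_PATH"
    workdir=$(mktemp -d) || return 1
    tar -xzf "$archive" -C "$workdir" || return 1
    mkdir -p "$config_dir/species" || return 1
    # Copy each unpacked top-level directory; plain files are ignored
    for d in "$workdir"/*; do
        if [ -d "$d" ]; then
            cp -r "$d" "$config_dir/species/" || return 1
        fi
    done
    rm -rf "$workdir"
}
```

<p>For example, <code>install_augustus_species parameters.tar.gz "$AUGUSTUS_CONFIG_PATH"</code>.</p>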
</blockquote>
<div>Hope this helps</div>
<div><br>
</div>
<div>PS: please hit reply-all so this is logged on the MAKER mailing list in case anybody else runs into similar issues.
<br>
</div>
</div>
</div>
</div>
</div>
<br>
<div class="x_gmail_quote">
<div dir="ltr" class="x_gmail_attr">On Thu, 7 Feb 2019 at 06:36, morgan sobol <<a href="mailto:morgan_starr_s@live.com">morgan_starr_s@live.com</a>> wrote:<br>
</div>
<blockquote class="x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
I have not used SNAP or CEGMA; however, I see that CEGMA was discontinued in 2015. </div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Do you think that will be a problem, or is it still worth using the old version?</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div>
<div id="x_gmail-m_100415959530892022appendonsend"></div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr style="display:inline-block; width:98%">
<div id="x_gmail-m_100415959530892022divRplyFwdMsg" dir="ltr"><font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Xabier Vázquez-Campos <<a href="mailto:xvazquezc@gmail.com" target="_blank">xvazquezc@gmail.com</a>><br>
<b>Sent:</b> Tuesday, February 5, 2019 4:42 PM<br>
<b>To:</b> morgan sobol; Maker Mailing List<br>
<b>Subject:</b> Re: [maker-devel] Re-annotation, fewer gene predictions</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div>Don't you use SNAP? It usually produces quite decent results, and it is easier to train than any of the other predictors.<br>
</div>
<div><br>
</div>
<div>In any case, the Augustus gene model is way off in both cases.<br>
</div>
<div>GM (GeneMark) doesn't look bad for the first genome, assuming your fungus has a fairly typical genome. For the second, it looks bad.<br>
</div>
<div><br>
</div>
<div>I'm not too familiar with re-annotation, but I'd create the gene models from scratch rather than reuse the ones from the Illumina-only genomes.</div>
<div>Note that long-read assemblies have a higher proportion of repetitive elements that need masking, and RepeatMasker alone may not be enough. In theory, this shouldn't affect the Augustus model if it was trained through BUSCO, as BUSCO uses defined conserved markers
to create the gene model, but I'm not so sure about GM.</div>
<div><br>
</div>
<div>If you trained Augustus with BUSCO, and this is the result, I'd discard the gene model and train it again the "traditional way", i.e. as we used to when we only had CEGMA. I had good results just by changing the training method.</div>
<div><br>
</div>
<div>Hope it helps,</div>
<div>Xabi</div>
<div><br>
</div>
<div><br>
</div>
<div><br>
</div>
</div>
<br>
<div class="x_gmail-m_100415959530892022x_gmail_quote">
<div dir="ltr" class="x_gmail-m_100415959530892022x_gmail_attr">On Wed, 6 Feb 2019 at 02:19, morgan sobol <<a href="mailto:morgan_starr_s@live.com" target="_blank">morgan_starr_s@live.com</a>> wrote:<br>
</div>
<blockquote class="x_gmail-m_100415959530892022x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div dir="ltr">
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Thank you, Xabi, for the response. </div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
The number of proteins from each source is much lower than before. </div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Previous numbers were 325, 10,899, and 11,243 for augustus, genemark, and maker respectively. </div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
The more recent numbers are 25, 857, and 4,418, respectively. </div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
So do you think this hints that something is wrong with genemark? </div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
Morgan</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<div>
<div id="x_gmail-m_100415959530892022x_gmail-m_5939907033713183898gmail-m_5921756504511149049gmail-m_-973663850771284679appendonsend">
</div>
<div style="font-family:Calibri,Helvetica,sans-serif; font-size:12pt; color:rgb(0,0,0)">
<br>
</div>
<hr style="display:inline-block; width:98%">
<div id="x_gmail-m_100415959530892022x_gmail-m_5939907033713183898gmail-m_5921756504511149049gmail-m_-973663850771284679divRplyFwdMsg" dir="ltr">
<font face="Calibri, sans-serif" color="#000000" style="font-size:11pt"><b>From:</b> Xabier Vázquez-Campos <<a href="mailto:xvazquezc@gmail.com" target="_blank">xvazquezc@gmail.com</a>><br>
<b>Sent:</b> Sunday, February 3, 2019 4:43 PM<br>
<b>To:</b> morgan sobol<br>
<b>Cc:</b> <a href="mailto:maker-devel@yandell-lab.org" target="_blank">maker-devel@yandell-lab.org</a><br>
<b>Subject:</b> Re: [maker-devel] Re-annotation, fewer gene predictions</font>
<div> </div>
</div>
<div>
<div dir="ltr">
<div dir="ltr">
<div>Hi Morgan,</div>
<div><br>
</div>
<div>We had a similar issue with AUGUSTUS underpredicting when using a BUSCO-derived gene model:</div>
<div><a href="https://groups.google.com/d/msg/maker-devel/ocnDG4nq1A8/NyCPzzRgAgAJ" target="_blank">https://groups.google.com/d/msg/maker-devel/ocnDG4nq1A8/NyCPzzRgAgAJ</a></div>
<div><br>
</div>
<div>Also, check the number of proteins predicted by each individual predictor. If the numbers from one of them are off, that may point to the source of the problem.</div>
<div>We didn't have a very good experience with GM, as it used to overpredict an absurd number of proteins.</div>
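<div>One possible way to get those per-predictor numbers from a MAKER GFF3 file is sketched below. It simply counts features per source (column 2) for a given feature type (column 3). The assumption, not confirmed in this thread, is that ab initio predictions show up as <code>match</code> features with sources such as <code>augustus_masked</code> or <code>genemark</code>, and final annotations as <code>gene</code> features with source <code>maker</code>; <code>count_by_source</code> is a hypothetical helper name.</div>

```shell
# Hypothetical helper: count GFF3 features per source for one feature type.
# Output is "source<TAB>count", sorted by source name.
count_by_source() {
    gff="$1"
    feature="${2:-match}"    # feature type to count (GFF3 column 3)
    awk -F '\t' -v type="$feature" '
        /^##FASTA/ { exit }  # stop at the embedded sequence section, if any
        /^#/       { next }  # skip comment and directive lines
        $3 == type { n[$2]++ }
        END { for (s in n) print s "\t" n[s] }
    ' "$gff" | sort
}
```

<div>For example, <code>count_by_source genome.all.gff match</code> for the predictors and <code>count_by_source genome.all.gff gene</code> for the final MAKER genes, where <code>genome.all.gff</code> is a placeholder for your merged GFF3 file.</div>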
<div><br>
</div>
<div>Xabi<br>
</div>
</div>
</div>
<br>
<div class="x_gmail-m_100415959530892022x_gmail-m_5939907033713183898gmail-m_5921756504511149049gmail-m_-973663850771284679x_gmail_quote">
<div dir="ltr" class="x_gmail-m_100415959530892022x_gmail-m_5939907033713183898gmail-m_5921756504511149049gmail-m_-973663850771284679x_gmail_attr">
On Mon, 4 Feb 2019 at 06:15, morgan sobol <<a href="mailto:morgan_starr_s@live.com" target="_blank">morgan_starr_s@live.com</a>> wrote:<br>
</div>
<blockquote class="x_gmail-m_100415959530892022x_gmail-m_5939907033713183898gmail-m_5921756504511149049gmail-m_-973663850771284679x_gmail_quote" style="margin:0px 0px 0px 0.8ex; border-left:1px solid rgb(204,204,204); padding-left:1ex">
<div>Hello,
<div><br>
</div>
<div>I previously used Maker to annotate two different fungal genomes that were created using Illumina sequences only. For these genomes, I had over 11,000 genes predicted. </div>
<div>I recently obtained PacBio sequences for the same genomes, so I created two hybrid assemblies. Both assemblies were very similar to the Illumina-only assemblies in length and in number of complete orthologs, but had far fewer, longer contigs. </div>
<div><br>
</div>
<div>I re-ran Maker using the settings below. For one of my genomes, I got around 11,000 genes predicted again, as expected. However, for the other genome, I keep getting ~4,400 predicted genes. </div>
<div><br>
</div>
<div>How can I determine why I keep getting fewer predicted genes for only one of my genomes, even though I ran both with the same settings?</div>
<div><br>
</div>
<div>Thanks,</div>
<div>Morgan S. </div>
<div><br>
</div>
<div>maker_opts.log</div>
<div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----Genome (these are always required)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">genome=/work/Geomicrobiology/msobol/IODP_329_SPG/1368D2H1/repeatmasker/unicycler/1368D_unicycler_contigs.fasta.masked #genome sequence (fasta file or$</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----Re-annotation Using MAKER Derived GFF3</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">maker_gff=/work/Geomicrobiology/msobol/IODP_329_SPG/1368D2H1/maker/1368D_2H1_contigs.fasta.maker.output/1368D_2H1_contigs.fasta.all.gff #MAKER derive$</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">altest_pass=1 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">protein_pass=1 #use protein alignments in maker_gff: 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----EST Evidence (for best results provide a file for at least one)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">est= #set of ESTs or assembled mRNA-seq in fasta format</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">altest= #EST/cDNA sequence file in fasta format from an alternate organism</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">est_gff= #aligned ESTs or mRNA-seq from an external GFF3 file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">altest_gff= #aligned ESTs from a closly relate species in GFF3 format</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----Protein Homology Evidence (for best results provide a file for at least one)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">protein=/work/Geomicrobiology/msobol/IODP_329_SPG/uniprot_sprot.fasta #protein sequence file in fasta format (i.e. from mutiple oransisms)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">protein_gff= #aligned protein homology evidence from an external GFF3 file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----Repeat Masking (leave values blank to skip repeat masking)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">model_org= #select a model organism for RepBase masking in RepeatMasker</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">rm_gff= #pre-identified repeat elements from an external GFF3 file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">softmask=0 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----Gene Prediction</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">snaphmm= #SNAP HMM file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">gmhmm=/home/msobol/genemark/68D_2/output/gmhmm.mod #GeneMark HMM file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">augustus_species=1368D_uni #Augustus gene prediction species model</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">fgenesh_par_file= #FGENESH parameter file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">pred_gff= #ab-initio predictions from an external GFF3 file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">protein2genome=1 #infer predictions from protein homology, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">snoscan_rrna= #rRNA file to have Snoscan find snoRNAs</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----Other Annotation Feature Types (features MAKER doesn't recognize)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">other_gff= #extra features to pass-through to final MAKER generated GFF3 file</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----External Application Behavior Options</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">cpus=1 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">#-----MAKER Behavior Options</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">max_dna_len=100000 #length for dividing up contigs into chunks (increases/decreases memory usage)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">min_contig=1 #skip genome contigs below this length (under 10kb are often useless)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
</div>
<div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">pred_flank=200 #flank for extending evidence clusters sent to gene predictors</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">pred_stats=1 #report AED and QI statistics for all predictions as well as models</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">min_protein=0 #require at least this many amino acids in predicted proteins</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">keep_preds=1 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">single_exon=1 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">single_length=250 #min length required for single exon ESTs if 'single_exon' is enabled</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56); min-height:13px">
<span style="font-variant-ligatures:no-common-ligatures"></span><br>
</div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">tries=2 #number of times to try a contig if there is a failure for some reason</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no</span></div>
<div style="margin:0px; font-size:11px; line-height:normal; font-family:Menlo; color:rgb(148,55,255); background-color:rgb(16,26,56)">
<span style="font-variant-ligatures:no-common-ligatures">TMP= #specify a directory other than the system default temporary directory for temporary files</span></div>
</div>
<div><span style="font-variant-ligatures:no-common-ligatures"><br>
</span></div>
</div>
_______________________________________________<br>
maker-devel mailing list<br>
<a href="mailto:maker-devel@box290.bluehost.com" target="_blank">maker-devel@box290.bluehost.com</a><br>
<a href="http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org" rel="noreferrer" target="_blank">http://box290.bluehost.com/mailman/listinfo/maker-devel_yandell-lab.org</a><br>
</blockquote>
</div>
<br clear="all">
<br>
-- <br>
<div dir="ltr" class="x_gmail-m_100415959530892022x_gmail-m_5939907033713183898gmail-m_5921756504511149049gmail-m_-973663850771284679x_gmail-m_1929135368719960229gmail_signature">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>Xabier Vázquez-Campos, <i>PhD</i><br>
<i>Research Associate</i><br>
NSW Systems Biology Initiative<br>
School of Biotechnology and Biomolecular Sciences<br>
The University of New South Wales<br>
Sydney NSW 2052 AUSTRALIA<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</body>
</html>