<html><head><meta http-equiv="Content-Type" content="text/html; charset=utf-8"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">Once Augustus is trained, there will be a new species directory under …/augustus/config/species/ for the organism you just trained. If you trained Augustus elsewhere (website, BUSCO, etc.), you have to copy the species data there. Then you just supply the species name to MAKER (the Augustus species option in maker_opts.ctl) and Augustus finds it automatically (see the Augustus documentation on training).<br class=""><div><br class=""></div><div>With est2genome=1 and protein2genome=1, MAKER takes the exonerate est2genome and protein2genome alignments and, if they are mostly open reading frame, turns them directly into gene/mRNA/exon/CDS models. If the resulting GFF3 contains none of those models but does contain est2genome and protein2genome alignments, then all of the alignments have broken ORFs. That means there are serious issues with your assembly, or with the EST FASTA or protein FASTA file. For the protein FASTA, I recommend UniProt/Swiss-Prot because it is manually curated and contains a broad dataset. If you cannot get gene models even from UniProt/Swiss-Prot protein2genome alignments, then your assembly has issues (too fragmented, many errors introducing random stop codons, or many N’s interspersed in the sequence).</div><div><br class=""></div><div>—Carson</div><div><br class=""></div><div><br class=""></div><div><br class=""><blockquote type="cite" class=""><div class="">On Oct 8, 2018, at 2:40 PM, Gupta, Parul <<a href="mailto:Parul.Gupta@oregonstate.edu" class="">Parul.Gupta@oregonstate.edu</a>> wrote:</div><br class="Apple-interchange-newline"><div class="">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" class="">
<div style="word-wrap: break-word; -webkit-nbsp-mode: space; line-break: after-white-space;" class="">
I used Augustus (separately from MAKER) to generate a training set based on transcripts (FASTA). How can I use that Augustus-generated training data (hints in GFF3 format) in MAKER for gene prediction? I can see only the Augustus species option in maker_opts.ctl.
Which option do I need to set in maker_opts.ctl to supply the Augustus-generated hints file? I have augustus.gff as the predicted hints.
<div class=""><br class="">
</div>
<div class="">
<blockquote type="cite" class="">est2genome doesn't work with est_gff. You must provide fasta of assembled transcripts. You can revert back to the GFF3 if you want after training.</blockquote>
<div class=""><br class="">
</div>
I used the EST FASTA, not the est_gff.</div>
<div class=""><br class="">
</div>
<div class="">
<blockquote type="cite" class="">Find a contig with protein2genome results in the GFF3 </blockquote>
<br class="">
</div>
<div class="">Yes, I can see protein2genome results in the GFF3:</div>
<div class=""><br class="">
</div>
<div class="">
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>protein_match<span class="Apple-tab-span" style="white-space:pre"> </span>
31566<span class="Apple-tab-span" style="white-space:pre"> </span>32621<span class="Apple-tab-span" style="white-space:pre">
</span>1426<span class="Apple-tab-span" style="white-space:pre"> </span>+<span class="Apple-tab-span" style="white-space:pre">
</span>.<span class="Apple-tab-span" style="white-space:pre"> </span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446673;Name=Mlong585_07911-RA;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>31566<span class="Apple-tab-span" style="white-space:pre">
</span>31775<span class="Apple-tab-span" style="white-space:pre"> </span>1426<span class="Apple-tab-span" style="white-space:pre">
</span>+<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1532540;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446673;Name=Mlong585_07911-RA;Target=Mlong585_07911-RA 82 154;Gap=M14 I3 M56;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>31872<span class="Apple-tab-span" style="white-space:pre">
</span>32621<span class="Apple-tab-span" style="white-space:pre"> </span>1426<span class="Apple-tab-span" style="white-space:pre">
</span>+<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1532541;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446673;Name=Mlong585_07911-RA;Target=Mlong585_07911-RA 155 409;Gap=M126 I5 M124;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>protein_match<span class="Apple-tab-span" style="white-space:pre"> </span>
33816<span class="Apple-tab-span" style="white-space:pre"> </span>35829<span class="Apple-tab-span" style="white-space:pre">
</span>1394<span class="Apple-tab-span" style="white-space:pre"> </span>-<span class="Apple-tab-span" style="white-space:pre">
</span>.<span class="Apple-tab-span" style="white-space:pre"> </span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446674;Name=Mlong585_12901-RA;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>34916<span class="Apple-tab-span" style="white-space:pre">
</span>35829<span class="Apple-tab-span" style="white-space:pre"> </span>1394<span class="Apple-tab-span" style="white-space:pre">
</span>-<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1532542;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446674;Name=Mlong585_12901-RA;Target=Mlong585_12901-RA 41 343;Gap=M27 D1 M276 F2;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>33816<span class="Apple-tab-span" style="white-space:pre">
</span>34182<span class="Apple-tab-span" style="white-space:pre"> </span>1394<span class="Apple-tab-span" style="white-space:pre">
</span>-<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1532543;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446674;Name=Mlong585_12901-RA;Target=Mlong585_12901-RA 344 466;Gap=R2 M123;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>protein_match<span class="Apple-tab-span" style="white-space:pre"> </span>
49636<span class="Apple-tab-span" style="white-space:pre"> </span>51466<span class="Apple-tab-span" style="white-space:pre">
</span>1091<span class="Apple-tab-span" style="white-space:pre"> </span>-<span class="Apple-tab-span" style="white-space:pre">
</span>.<span class="Apple-tab-span" style="white-space:pre"> </span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446675;Name=Mlong585_07901-RA;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">protein2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>51354<span class="Apple-tab-span" style="white-space:pre">
</span>51466<span class="Apple-tab-span" style="white-space:pre"> </span>1091<span class="Apple-tab-span" style="white-space:pre">
</span>-<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1532544;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:446675;Name=Mlong585_07901-RA;Target=Mlong585_07901-RA 1 36;Gap=M20 D1 M16 F2;</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-variant-ligatures: no-common-ligatures" class=""><br class="">
</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<span style="font-family: Helvetica; font-size: 12px;" class="">and est2genome results in the GFF3 as well:</span></div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<br class="">
</div>
<div style="margin: 0px; font-stretch: normal; font-size: 11px; line-height: normal; font-family: Menlo; background-color: rgb(255, 255, 255);" class="">
<div style="margin: 0px; font-stretch: normal; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">est2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>expressed_sequence_match<span class="Apple-tab-span" style="white-space:pre">
</span>48887305<span class="Apple-tab-span" style="white-space:pre"> </span>48890708<span class="Apple-tab-span" style="white-space:pre">
</span>16239<span class="Apple-tab-span" style="white-space:pre"> </span>+<span class="Apple-tab-span" style="white-space:pre">
</span>.<span class="Apple-tab-span" style="white-space:pre"> </span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hit:547163;Name=Sh_Salba_v2_61181;</span></div>
<div style="margin: 0px; font-stretch: normal; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">est2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>48887305<span class="Apple-tab-span" style="white-space:pre">
</span>48889881<span class="Apple-tab-span" style="white-space:pre"> </span>16239<span class="Apple-tab-span" style="white-space:pre">
</span>+<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1871792;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:547163;Name=Sh_Salba_v2_61181;Target=Sh_Salba_v2_61181 1 2590 +;Gap=M285 D1 M288 I10 M5 I4 M1998;</span></div>
<div style="margin: 0px; font-stretch: normal; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">est2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>48889982<span class="Apple-tab-span" style="white-space:pre">
</span>48890708<span class="Apple-tab-span" style="white-space:pre"> </span>16239<span class="Apple-tab-span" style="white-space:pre">
</span>+<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1871793;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:547163;Name=Sh_Salba_v2_61181;Target=Sh_Salba_v2_61181 2591 3317 +;Gap=M727;</span></div>
<div style="margin: 0px; font-stretch: normal; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">est2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>expressed_sequence_match<span class="Apple-tab-span" style="white-space:pre">
</span>48887305<span class="Apple-tab-span" style="white-space:pre"> </span>48890708<span class="Apple-tab-span" style="white-space:pre">
</span>16412<span class="Apple-tab-span" style="white-space:pre"> </span>+<span class="Apple-tab-span" style="white-space:pre">
</span>.<span class="Apple-tab-span" style="white-space:pre"> </span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hit:547164;Name=Sh_Salba_v2_61182;</span></div>
<div style="margin: 0px; font-stretch: normal; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">est2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>48887305<span class="Apple-tab-span" style="white-space:pre">
</span>48889881<span class="Apple-tab-span" style="white-space:pre"> </span>16412<span class="Apple-tab-span" style="white-space:pre">
</span>+<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1871794;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:547164;Name=Sh_Salba_v2_61182;Target=Sh_Salba_v2_61182 1 2590 +;Gap=M285 D1 M288 I10 M5 I4 M1998;</span></div>
<div style="margin: 0px; font-stretch: normal; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">est2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>match_part<span class="Apple-tab-span" style="white-space:pre"> </span>48889949<span class="Apple-tab-span" style="white-space:pre">
</span>48890708<span class="Apple-tab-span" style="white-space:pre"> </span>16412<span class="Apple-tab-span" style="white-space:pre">
</span>+<span class="Apple-tab-span" style="white-space:pre"> </span>.<span class="Apple-tab-span" style="white-space:pre">
</span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hsp:1871795;Parent=ScJhAqd_2184%3BHRSCAF%3D3164:hit:547164;Name=Sh_Salba_v2_61182;Target=Sh_Salba_v2_61182 2591 3350 +;Gap=M760;</span></div>
<div style="margin: 0px; font-stretch: normal; line-height: normal;" class=""><span style="font-variant-ligatures: no-common-ligatures" class="">ScJhAqd_2184%3BHRSCAF%3D3164<span class="Apple-tab-span" style="white-space:pre">
</span></span><span style="font-variant-ligatures: no-common-ligatures; color: #c33720" class=""><b class="">est2genome</b></span><span style="font-variant-ligatures: no-common-ligatures" class=""><span class="Apple-tab-span" style="white-space:pre">
</span>expressed_sequence_match<span class="Apple-tab-span" style="white-space:pre">
</span>48895479<span class="Apple-tab-span" style="white-space:pre"> </span>48899036<span class="Apple-tab-span" style="white-space:pre">
</span>9582<span class="Apple-tab-span" style="white-space:pre"> </span>+<span class="Apple-tab-span" style="white-space:pre">
</span>.<span class="Apple-tab-span" style="white-space:pre"> </span>ID=ScJhAqd_2184%3BHRSCAF%3D3164:hit:547165;Name=Sh_Salba_v2_108280;</span></div>
</div>
<div class="">
<div class=""><br class="">
</div>
<div class="">Thanks,</div>
<div class="">Parul</div>
<br class="">
<blockquote type="cite" class="">
<div class="">On Oct 8, 2018, at 3:11 PM, Carson Holt <<a href="mailto:carsonhh@gmail.com" class="">carsonhh@gmail.com</a>> wrote:</div>
<br class="Apple-interchange-newline">
<div class="">
<div class=""><br class="">
<blockquote type="cite" class="">We ran BUSCO and there is no problem with the genome assembly. I used RepeatMasker (separately from the MAKER pipeline) to mask repeats with a custom-generated library (de novo repeats plus repeat libraries from other species).
The masked genome was used as input in maker_opts.ctl.<br class="">
</blockquote>
<br class="">
Let MAKER run the masking if possible. Also, BUSCO can be used to train Augustus, which can then become the gene predictor in MAKER.<br class="">
<br class="">
<br class="">
<blockquote type="cite" class="">Transcripts-<br class="">
We have RNA-Seq data from the same species as the sequenced genome, assembled with Velvet/Oases. I globally aligned the transcripts to the assembled genome using GMAP, which gave ~99% mapping. The GFF3 generated by GMAP was also checked in a genome browser. Those transcripts
were used as the est input in maker_opts.ctl. The assembled transcripts may be redundant.<br class="">
</blockquote>
<br class="">
est2genome doesn't work with est_gff. You must provide fasta of assembled transcripts. You can revert back to the GFF3 if you want after training.<br class="">
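<div class="">As a rough illustration of the point above, a first evidence-only round might use a maker_opts.ctl fragment along these lines. This is a sketch, not a complete control file: the file names are placeholders, and only the options discussed in this thread are shown.</div>

```
# maker_opts.ctl fragment (file names are hypothetical placeholders)
est=transcripts.fasta      # assembled transcripts as FASTA
est_gff=                   # left empty here: est2genome needs the sequences, not pre-aligned GFF3
protein=proteins.fasta     # protein evidence, e.g. UniProt/Swiss-Prot
est2genome=1               # turn transcript alignments directly into gene models
protein2genome=1           # turn protein alignments directly into gene models
```

<div class="">After training, est2genome and protein2genome would normally be set back to 0, and the GFF3 alignments can be used again if preferred.</div>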
<br class="">
<br class="">
<blockquote type="cite" class="">Proteins-<br class="">
I used protein (fasta seq) sequences downloaded from uniprot for 5 closely related species and one from in-house sequenced genome (already published). Protein sequences from all 6 organisms are concatenated in one file and used as protein evidence in maker_opts.ctl.<br class="">
</blockquote>
<br class="">
Look at the contigs in a browser. Find a contig with protein2genome results in the GFF3 (i.e. the source column is marked protein2genome), and look at it specifically. If you don’t find any, then the issue is either your pre-masking or the evidence proteins
you gave. I’d recommend using UniProt/Swiss-Prot, which contains a broad set of curated and conserved proteins.<br class="">
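<div class="">One quick way to see which sources are present before opening a browser is to tally the source and type columns of the GFF3. The following is a minimal Python sketch; the function names are made up for illustration, and it assumes that MAKER writes its final gene models with "maker" in the source column.</div>

```python
import sys
from collections import Counter


def tally_gff3(lines):
    """Count (source, type) pairs over the feature lines of a GFF3 file."""
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments and directives
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a well-formed feature line
        counts[(cols[1], cols[2])] += 1
    return counts


def has_gene_models(counts, source="maker"):
    """True if at least one gene feature from the given source was seen.

    The default source name "maker" is an assumption about the output.
    """
    return counts[(source, "gene")] > 0


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        counts = tally_gff3(handle)
    for (source, ftype), n in sorted(counts.items()):
        print(f"{source}\t{ftype}\t{n}")
    if not has_gene_models(counts):
        print("No gene models found; inspect the alignments for broken ORFs.")
```

<div class="">Run it with the path to a GFF3 file as its only argument. If it lists protein2genome and est2genome alignments but no gene features, look at those contigs more closely.</div>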
<br class="">
<br class="">
<blockquote type="cite" class="">altest=transcripts.fasta (from in-house sequenced genome (already published))<br class="">
</blockquote>
<br class="">
These will be ignored until you have a trained HMM (this type of alignment can only be used as hints to the trained predictor).<br class="">
<br class="">
—Carson<br class="">
<br class="">
</div>
</div>
</blockquote>
</div>
<br class="">
</div>
</div>
</div></blockquote></div><br class=""></body></html>