[maker-devel] which files are expected after fasta_merge?
Carson Holt
carsonhh at gmail.com
Fri Aug 30 10:48:52 MDT 2019
If you disabled evidence for round 3 (i.e., left protein= and est= empty), then you will get no annotations and EVM will not run. You can check this by loading the GFF3 in a genome browser: if it contains no protein/EST alignments, that is likely the cause.
—Carson
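
As a rough alternative to opening the GFF3 in a browser, the check described above can be sketched as a tally of the GFF3 source column (column 2). This is only an illustrative sketch, not part of MAKER: the function name `tally_gff3_sources` and the example records are made up, and the exact source labels that MAKER writes for protein and EST alignments (for example `protein2genome`) are an assumption that should be confirmed against your own merged GFF3.

```python
import io
from collections import Counter


def tally_gff3_sources(handle):
    """Count feature lines per value of the GFF3 source column (column 2)."""
    counts = Counter()
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        fields = line.split("\t")
        if len(fields) == 9:  # a GFF3 feature line has exactly nine columns
            counts[fields[1]] += 1
    return counts


# Made-up records for illustration only; the source labels are assumptions.
example = io.StringIO(
    "##gff-version 3\n"
    "contig1\tsnap_masked\tmatch\t100\t900\t.\t+\t.\tID=m1\n"
    "contig1\taugustus_masked\tmatch\t120\t880\t.\t+\t.\tID=m2\n"
    "contig1\tprotein2genome\tprotein_match\t110\t870\t.\t+\t.\tID=m3\n"
    "##FASTA\n"
    ">contig1\n"
    "ACGT\n"
)
example_counts = tally_gff3_sources(example)
for source, n in sorted(example_counts.items()):
    print(source, n)
```

To run it on real output, pass an open file handle for the GFF3 produced by gff3_merge instead of the `io.StringIO` example. If the tally shows only ab initio sources and nothing that looks like a protein or EST alignment, that would be consistent with evidence having been disabled for that round.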
> On Aug 15, 2019, at 2:48 PM, Brandon Pickett <pickettbd at gmail.com> wrote:
>
> Good afternoon!
>
> I just finished my third round of maker. I trained snap, augustus, etc. between the rounds. I used fasta_merge and gff3_merge to extract files after each round of maker. gff3_merge performed as expected each time, but fasta_merge surprised me. I will show you which files fasta_merge generated after each round. Please note that, as many people do, I renamed my output files from the default. Accordingly, I will list all the files with a generalized prefix of "maker" and show the rest of the file name as it was generated for me. Also note that I've changed .fasta to .fa for brevity.
>
> After round #1:
> transcripts.fa
> proteins.fa
>
> After round #2:
> non_overlapping_ab_initio.proteins.fa
> non_overlapping_ab_initio.transcripts.fa
> transcripts.fa
> augustus_masked.proteins.fa
> augustus_masked.transcripts.fa
> evm.proteins.fa
> evm.transcripts.fa
> genemark.proteins.fa
> genemark.transcripts.fa
> snap_masked.proteins.fa
> snap_masked.transcripts.fa
> proteins.fa
>
> After round #3:
> non_overlapping_ab_initio.proteins.fa
> non_overlapping_ab_initio.transcripts.fa
> augustus_masked.proteins.fa
> augustus_masked.transcripts.fa
> genemark.proteins.fa
> genemark.transcripts.fa
> snap_masked.proteins.fa
> snap_masked.transcripts.fa
>
> I am not surprised that I didn't get all of these files after round #1, because I used round #1 only to generate gene models from transcript evidence. I didn't expect so many files after round #2 (having only seen the output from round #1 up to that point), but it makes sense that I would get output from augustus, EVidenceModeler (evm), genemark, and snap, since I provided them as input to this round (#2) of maker.
>
> Between rounds #2 and #3, I re-trained snap and augustus. Genemark was trained between rounds #1 and #2 without gene models from maker and thus did not require re-training. The only differences in my maker control files between rounds #2 and #3 were the paths to the snap and augustus files. In both #2 and #3, the control files had run_evm=1. I can provide my control files for each round, if needed.
>
> My question is: why were transcripts.fa, proteins.fa, evm.proteins.fa, and evm.transcripts.fa not generated after round #3? I recognize that this is probably not an error, but rather a gap in my understanding of when each file is and is not generated.
>
> Thank you,
> Brandon Pickett
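
The per-round file lists in the quoted message can also be compared mechanically. The following is a minimal sketch using Python sets; the variable names are illustrative, and the file names are copied from the lists above (with the generalized "maker" prefix omitted, as in the message).

```python
# File names as listed in the quoted message for each round.
round1 = {"transcripts.fa", "proteins.fa"}

round2 = {
    "non_overlapping_ab_initio.proteins.fa",
    "non_overlapping_ab_initio.transcripts.fa",
    "transcripts.fa",
    "augustus_masked.proteins.fa",
    "augustus_masked.transcripts.fa",
    "evm.proteins.fa",
    "evm.transcripts.fa",
    "genemark.proteins.fa",
    "genemark.transcripts.fa",
    "snap_masked.proteins.fa",
    "snap_masked.transcripts.fa",
    "proteins.fa",
}

round3 = {
    "non_overlapping_ab_initio.proteins.fa",
    "non_overlapping_ab_initio.transcripts.fa",
    "augustus_masked.proteins.fa",
    "augustus_masked.transcripts.fa",
    "genemark.proteins.fa",
    "genemark.transcripts.fa",
    "snap_masked.proteins.fa",
    "snap_masked.transcripts.fa",
}

# Files produced after round 2 but absent after round 3.
missing_in_round3 = sorted(round2 - round3)
# Files produced after round 3 that round 2 did not produce.
new_in_round3 = sorted(round3 - round2)

print("missing after round 3:", missing_in_round3)
print("new after round 3:", new_in_round3)
```

The difference is exactly the four files asked about (the final annotation transcripts and proteins, plus the two evm files), and round 3 adds nothing new. Every file that remains after round 3 is an ab initio prediction output.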